A Simple Introduction to Machine Learning, with Recommended Learning Materials

How do you get started with machine learning? There is currently no single clear answer. This site is aimed at beginners and recommends an easy entry route and learning materials for machine learning (including deep learning). Once you have started, you will know which areas of your knowledge need strengthening and what materials to look for.

1. The formal learning route for machine learning

How do you learn machine learning well? The conventional route looks like this:

1. Learn the basics of mathematics

Mathematical analysis (calculus), linear algebra, probability theory, statistics, applied statistics, numerical analysis, ordinary differential equations, partial differential equations, numerical partial differential equations, operations research, discrete mathematics, stochastic processes, stochastic partial differential equations, abstract algebra, real analysis, functional analysis, complex analysis, mathematical modeling, topology, differential geometry, asymptotic analysis...

2. Learn classic machine learning books and tutorials

Classic books: Duda's "Pattern Classification", Mitchell's "Machine Learning", Li Hang's "Statistical Learning Methods", Zhou Zhihua's "Machine Learning"...

Andrew Ng (Wu Enda): "Machine Learning" open course, "Deep Learning" open course.

Hsuan-Tien Lin (Lin Xuantian): "Machine Learning Foundations", "Machine Learning Techniques".


3. Learn programming languages

Become proficient in programming languages such as Python, Java, R, MATLAB, and C++.

4. Read papers

Learn English well, read classic papers, and read the latest machine learning papers, such as those from top conferences, to keep up with the latest technical directions.

5. Participate in real projects

For example, internships at large tech companies, or data competitions such as Kaggle and Tianchi...

After completing the route above, you may not become a leader in the industry, but you should have no trouble finishing a PhD.

Problems with the route above:

  • Most learners do not study for scientific research, but hope to use machine learning as a tool.
  • Most learners have limited time, can't finish learning so much material, and don't know how to choose.
  • Some materials are too difficult. The authors assume that the learner already has a certain foundation and omit some steps. Many beginners feel like this:


Figure: Many materials omit key steps

In fact, most people study machine learning only so that they can use its algorithms and tools to solve problems and understand the basic principles of the algorithms. They do not need to go that deep. The very first step of the route above discourages many people. Few people can learn the mathematical foundations as thoroughly as a PhD student, and few can finish all the classic books and popular tutorials. They just want to get started with machine learning quickly.

For these readers, this site recommends a quick entry route to machine learning.

2. A quick introduction to machine learning

1. Basic knowledge

Be familiar with basic mathematics. The three most important courses are advanced mathematics (calculus), linear algebra, and probability theory and mathematical statistics. These should be compulsory for undergraduates. If you really have forgotten them, read the article "Mathematical Foundations of Machine Learning", which also offers mathematics materials for download. You don't need to understand everything, but you should roughly understand the basic formulas, and you can look formulas up in the materials when you need them. There are two formula summaries:

1) Mathematical foundation of machine learning.docx

(Chinese version, summarizes the formulas of advanced mathematics, linear algebra, probability theory and mathematical statistics)

2) Mathematical foundation of machine learning at Stanford University.pdf

(The original English material is very comprehensive. Readers who are comfortable with English are advised to study it directly.)

I strongly recommend building a solid foundation in mathematics, since it determines how far a machine learning practitioner can go.

However, if you have very little study time and just want to get started with machine learning, you can study one of the two formula summaries above.

2. Machine learning tutorials

1) The best tutorial for getting started with machine learning

This is probably the "Machine Learning" open course by Andrew Ng (Wu Enda). The course is aimed at beginners and focuses on practical application rather than mathematical derivation. Although it was launched early, it is still the most popular open machine learning course, with very high ratings, and it comes with programming assignments (Octave version).

Notes for studying this course:

  • Chapter 5 (Octave tutorial) and Chapter 18 (application examples) are somewhat outdated and can be skipped.
  • You don't need to do the original Octave assignments; you can do the revised Python assignments instead.
  • If you also plan to take Andrew Ng's "Deep Learning" open course, you can skip weeks 4, 5, and 6 here and study the corresponding content in "Deep Learning" directly.
  • Try to finish this course within three months. If you don't understand some parts, that is fine; you can come back to them later when you need them.
  • It is best to follow this course together with the course notes, which this site provides for download.

The course videos, notes, and Python assignments can be downloaded from this article.

2) Machine learning cheat sheet

This site previously posted the article "Machine Learning Cheat Sheet (Understand Machine Learning Like Memorizing TOEFL Words)".

It turns machine learning concepts into a cheat sheet that is as convenient as a TOEFL vocabulary list, so you can pick up all those hard-to-remember concepts in minutes. It is suggested to finish it within a week. Skim it; it is fine if you don't understand some parts. Make a note of them and look them up later when you need them.

3) Li Hang "Statistical Learning Methods"

The book introduces 10 statistical learning methods in detail, including support vector machines, boosting, maximum entropy, and conditional random fields. It requires some mathematical background. This is a classic among classics: many Chinese online courses, as well as the interviews and written tests of Internet companies, draw on this book to some degree. It is somewhat difficult for beginners. However, if you want to pass interviews and written tests, you should understand this book and try to derive the algorithms yourself.

4) The best introductory tutorial for deep learning

Andrew Ng's "Deep Learning" open course

This video course explains the main algorithms and frameworks of deep learning very clearly and in the simplest possible way. It comes with programming assignments and quizzes. After finishing it, you can consider yourself to have gotten started with deep learning. Study suggestions for each part:

  • Chapter 1: Neural Networks and Deep Learning

Part of the content is an upgraded version of weeks 4 and 5 of the "Machine Learning" open course.

  • Chapter 2: Improving Deep Neural Networks

This part has essentially no overlap with the "Machine Learning" open course.

  • Chapter 3: Structuring Machine Learning Projects

Part of the content is an upgraded version of week 6 of the "Machine Learning" open course.

  • Chapter 4: Convolutional Neural Networks

This part is mainly used for image processing and object detection. It is roughly a simplified version of Stanford's CS231n course (deep learning and computer vision) taught by Fei-Fei Li.

  • Chapter 5: Sequence Models

This part is mainly used for natural language processing. Note that the notation used for the RNN/LSTM structures in this course differs slightly from the notation in the original papers and in most blogs.

The course videos, notes, and Python assignments can be downloaded from this article.

5) Li Hongyi's "Understanding Deep Learning in One Day" handout

This is the deep learning handout by Professor Hung-yi Lee (Li Hongyi) of National Taiwan University. It is the most accessible introductory material on deep learning I have seen. In a little over 300 pages, it systematically explains the basic principles of deep learning in an easy-to-understand way, as vividly as the machine learning cheat sheet.

It is recommended to go through this handout once over a few days to get a basic understanding of what deep learning is and what it is used for.

3. Learn programming languages

Since the goal is just to get started, Python is the only programming language recommended here.

Python is the main programming tool for machine learning. How well do you need to know Python? In my opinion, getting started is what matters most: at the very least, you should know how to search (for example, on Baidu) when you run into problems.

1) Python installation:

For installing Python, I recommend downloading Anaconda. Anaconda is a Python distribution for scientific computing that supports Linux, Mac, and Windows. It provides package management and environment management, which makes it easy to keep multiple Python versions side by side, switch between them, and install third-party packages. Download link: www.anaconda.com/download/push... (Python 3.6 version)

IDE: PyCharm is recommended; the Community edition is free. Download address: www.jetbrains.com/

2) Recommended materials for getting started with Python

a. "Python for Data Analysis"

This book contains a large number of practical cases. You will learn how to use various Python libraries (including NumPy, pandas, matplotlib, and IPython) to solve all kinds of data analysis problems efficiently.

This was the first introductory Python material I read. If you work through all the code once, you will be able to handle most data analysis problems.

Download link: It is recommended to buy the book; the source code can be found by searching Baidu.

Note: A Chinese translation of the second edition is already available. It is recommended to search for it and download it.
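
To give a rough idea of the kind of work these libraries support, here is a minimal sketch using NumPy and pandas. The table, its column names, and its values are made up for illustration and do not come from the book.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: study hours and exam scores for two classes.
df = pd.DataFrame({
    "class_name": ["A", "A", "B", "B", "B"],
    "hours": [2.0, 4.0, 1.0, 3.0, 5.0],
    "score": [60.0, 80.0, 50.0, 70.0, 90.0],
})

# Group the rows by class and compute the mean of each numeric column.
class_means = df.groupby("class_name")[["hours", "score"]].mean()

# Fit a straight line score = slope * hours + intercept with NumPy.
slope, intercept = np.polyfit(df["hours"].to_numpy(), df["score"].to_numpy(), deg=1)

print(class_means)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The toy scores were chosen to lie exactly on a line, so the fitted slope and intercept simply recover it. Real data will be noisier, and the same two calls (`groupby` and `polyfit`) are only a starting point.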

b. Python introductory notes

By Li Jin. This is a Jupyter notebook that demonstrates the main syntax of Python and is well worth recommending.

Download link: pan.baidu.com/s/1snmeqlR

c. Nanjing University Python video tutorial

This tutorial is highly recommended; it covers the main Python syntax and the commonly used libraries.

Video download address: yun.baidu.com/s/1cCbERs Secret...

After going through these three materials, you will have basically gotten started with Python and can use machine learning libraries such as scikit-learn to solve machine learning problems.
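
As a minimal sketch of what using scikit-learn looks like, the example below trains a classifier on the small iris data set that ships with the library. The choice of data set, model, and split ratio is an assumption made for illustration, not a recommendation from the materials above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in data set: 150 iris flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples to estimate how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit a simple classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Measure accuracy on the held-out samples only.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

Most scikit-learn models follow the same pattern of `fit` on training data and `predict` on new data, so swapping in another model usually changes only the line that constructs it.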

3) Learning the main deep learning frameworks

The most basic of the main deep learning frameworks are probably TensorFlow and Keras. There are many tutorials to choose from; this site recommends a simple way to get started:

a. Getting started with Tensorflow

Section 3.11 of the second course of Andrew Ng's "Deep Learning" open course introduces the basic usage of TensorFlow (corresponding to p. 251 of the notes). Once you have mastered this usage, you can understand most TensorFlow code. Work through the course's programming assignments alongside it, and search Baidu for anything you don't understand.
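
For a feel of what basic TensorFlow usage involves, here is a minimal sketch that minimizes a toy quadratic cost by gradient descent. It assumes TensorFlow 2.x, whose API may differ from the version shown in the course, and the cost function, learning rate, and step count are chosen purely for illustration.

```python
import tensorflow as tf

# A single trainable parameter, started at 0.
w = tf.Variable(0.0)
learning_rate = 0.1

for step in range(200):
    with tf.GradientTape() as tape:
        # Toy cost (w - 5)^2, which is minimized at w = 5.
        cost = (w - 5.0) ** 2
    # TensorFlow computes d(cost)/dw automatically from the recorded operations.
    grad = tape.gradient(cost, w)
    # Plain gradient descent update: w <- w - learning_rate * grad.
    w.assign_sub(learning_rate * grad)

print(float(w.numpy()))
```

The point of the sketch is the division of labor: you write only the forward computation of the cost, and the framework derives the gradient. Real models replace the single variable with many parameters and usually use a built-in optimizer instead of the manual update.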

b. Getting started with Keras

The book "Deep Learning with Python" and its companion code. "Deep Learning with Python" is written by François Chollet, the creator of Keras and, at the time of writing, an artificial intelligence researcher at Google. It covers in detail the practice of deep learning with Python and Keras, including applications in computer vision, natural language processing, and generative models. The book contains more than 30 code examples, with detailed and thorough step-by-step explanations.

The author has published the code on GitHub, and it covers almost all the topics in the book. After studying this book, readers will be able to set up their own deep learning environment, build image recognition models, and generate images and text. There is one small regret: the explanations and comments in the code are all in English, which can be a struggle even for readers with good English.

This site has explained and annotated all the code in Chinese, downloaded the data sets the code requires (notably the "Dogs vs. Cats" data set), stored some of the images locally, and tested all the code. (Please run the files in order, since later code depends on earlier code.)

This site believes that this book and its code are the best tools for beginners to get started with deep learning and Keras.

Please click to download the e-book and the Chinese-annotated code.

4. Read papers

1) Learn English well and read some excellent papers

Read some classic papers selectively. If your English really is not good, just search Baidu for the title of the paper; many blogs explain the classic papers in detail.

The key to reading a paper: reproduce the author's algorithm.

Once you have successfully reproduced the algorithm from the paper, you will usually understand the paper deeply. Code for classic papers is available on GitHub, and code for the latest excellent papers can usually be found by searching.

2) Learn to organize papers

Classify and organize the papers you have read. We recommend Zotero, an excellent reference manager: it is powerful, lets you take notes on papers, and supports synchronization between different computers.

5. Participate in actual projects

If you have an internship opportunity at a large tech company, make the most of it and learn as much as you can.

If you don't have an internship opportunity, you can take part in Kaggle competitions. You don't have to rank highly. You can look up past competitions, download the data and other people's public code, and reproduce their results.

There are similar competitions in China, such as Tianchi and DF.

Usually, once you have done 2-3 competitions and reached the top 1% of the results, your coding ability is basically no longer a problem.

However, it is not recommended to spend too much time on competitions. Most of the time in a competition goes into feature engineering, which may not be needed in real work. As long as you can solve the problem, spend the rest of your time on learning.

6. Communicate with learners more

There are many ways to communicate, such as taking part in academic events and laboratory discussions, but I think the most effective way is to join discussion groups such as WeChat groups and QQ groups. As the saying goes, "some learn the Way earlier than others, and each has a specialty", so it is normal not to understand something. When you don't understand, ask: "When three people walk together, one of them is bound to be my teacher."

Figure: A QQ group is a good way to communicate

3. Summary

When learning machine learning, build as solid a foundation in mathematics as you can. A practitioner who lacks a solid mathematical foundation and only knows how to use tools and frameworks is like a martial artist who can only perform set routines: laypeople find it impressive, but in a real fight he ends up black and blue. It is fair to say that the mathematical foundation is the ceiling for a machine learning practitioner. This is why machine learning practitioners with higher academic qualifications tend to earn higher salaries: pay usually correlates positively with the foundational knowledge they have mastered.

The method in this article is only suitable for getting started quickly. Once you have started, you will know where your gaps are and can find materials to study on your own.

The method in this article is for reference only.

Machine learning beginners

QQ group: 654173748