Machine Learning for Engineers: Book List

In 2018, I posted a series of introductory, hands-on and more advanced book lists for engineers new to machine learning, to help develop a grounding in the fundamentals. This post provides an updated book list, along with a different grouping for the books.

Introduction

In working with others and in my own work, I've found it increasingly useful to organise the books into four broad areas, rather than the introductory-through-advanced approach from before. The areas are: machine learning & algorithms, mathematics & statistics, data science & analysis, and tools & frameworks.

 
 

This lets you delve in based on what's interesting and motivating, rather than climbing a ladder.

There are twenty-six books mentioned across the areas, which is a lot, while still being a fraction of what's available. As a result, eight are marked with an asterisk as go-to texts, two for each area, with the remainder useful to get to later (or skip, depending on your interest).

Please do look around for other options if these don't seem right for you! Some books click for people in ways others don't. For example, I'm more or less math impaired, but Kuldeep Singh's book on linear algebra worked really well for me once I found it, and has allowed me to access other books and material. And if you just want the list, there's a sheet here with the books and previous ones from the 2018 round.

As an engineer, you are going to be working with libraries and frameworks for the most part, but ideally you'll also have a grasp of the ideas underlying the technology. But first, some background books on the mathematics of machine learning.

Companion Mathematics

Work can get done in applied machine learning without mathematics. It is useful, though, for developing an intuition of what these systems are doing; it helps with working through the introductory books, and it is needed for the more advanced books. For what it's worth, when it comes to applied machine learning, it never feels like time spent on the foundational/math side is wasted. With that in mind, some good topics to know are:

Linear algebra and matrix manipulation. Linear algebra is everywhere in machine learning and you'll see it even in basic material. In practice libraries do the work for you, so you won't need to code this stuff up. Nonetheless, algebra and matrix operations are useful to know, as a lot of real world machine (and especially deep) learning involves matrix swizzling.
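To make that concrete, here's a minimal NumPy sketch (my own example, not from any of the books below) of the kind of matrix work a library does on your behalf: a single linear transformation, inputs times weights plus a bias.

```python
import numpy as np

# Two samples with two features each.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# A weight matrix and bias vector, the building blocks of a linear model layer.
W = np.array([[0.5, -1.0],
              [0.25, 0.75]])
b = np.array([0.1, -0.2])

# Matrix multiply plus a broadcast add: the core operation behind many models.
out = X @ W + b
print(out)
```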

Statistics and regression. Like linear algebra, these topics appear early on in machine learning texts. Regression, analysing data to see if there's a relationship between variables (eg, weather and ice cream sales), appears as a basic form of prediction (weather may indeed predict ice cream sales).
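As a sketch of the idea, here's an ordinary least squares fit with NumPy; the temperature and sales numbers are invented purely for illustration.

```python
import numpy as np

# Hypothetical data: daily temperature (°C) and ice cream sales (units sold).
temps = np.array([14, 16, 21, 25, 28, 31])
sales = np.array([120, 135, 210, 260, 300, 350])

# Fit a straight line: sales ≈ slope * temp + intercept.
slope, intercept = np.polyfit(temps, sales, deg=1)
print(f"sales ≈ {slope:.1f} * temp + {intercept:.1f}")

# The fitted line doubles as a very basic predictor.
print("predicted sales at 23°C:", slope * 23 + intercept)
```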

Probability. Probability becomes useful because machine learning so often deals with real world uncertainty, and machine learning results are often expressed as likelihoods. The Deep Learning book for example goes straight to probability after linear algebra in its concepts overview chapter.

Calculus. The amount of calculus worth knowing initially is small, enough to get a handle on things like gradient descent (which uses differentiation to control the rate at which a model “learns”) or backpropagation in neural networks. Again, libraries do this for you, but it's nice to have an intuition for what's going on.
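For intuition, here's gradient descent boiled down to a few lines on a toy function (my own sketch, not tied to any particular library):

```python
# Minimise f(x) = (x - 3)**2 by repeatedly stepping against the gradient.
def grad(x):
    return 2 * (x - 3)   # derivative of f

x = 0.0                  # starting guess
learning_rate = 0.1      # controls how fast the "learning" happens

for _ in range(50):
    x -= learning_rate * grad(x)

print(x)  # converges towards 3, the minimum
```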

On to the books:

* A Programmer's Introduction to Mathematics, by Jeremy Kun. Starts with polynomials and sets to ground things, and goes on to cover topics like graphs, linear algebra and calculus. The book does a good job of starting from a programmer's frame of reference and moving toward what mathematicians are more used to, providing progressively detailed explanations and mostly avoiding big conceptual jumps.

* Linear Algebra: Step by Step by Kuldeep Singh. There are a lot of linear algebra books, and I really like this one as a companion text and a self-study catchup, especially if it's been a while since you've “done math”. As an added bonus it formats well on Kindle. If you want to go beyond this, Introduction to Linear Algebra by Gilbert Strang is widely recommended as a standard text (with online lectures).

Probability: For the Enthusiastic Beginner by David Morin. This is a general and extremely well written overview of probability. It's a real gem and highly accessible.

Bayes' Rule: A Tutorial Introduction to Bayesian Analysis by James Stone. This is a good crisp introduction to Bayesian probability. 

Calculus: An Intuitive and Physical Approach by Morris Kline, or the more comprehensive The Calculus Lifesaver by Adrian Banner (also online), would be enough to get going. If you're feeling retro, there's Calculus Made Easy from 1910, which is a genuinely good and considerate introduction.

None of the books here deal directly with statistics: that's because I've found statistics sufficiently covered in the data science books, with a general math grounding being enough to get by. That said, if you want something introductory just for stats, Statistics without Tears by Derek Rowntree or The Art of Statistics by David Spiegelhalter are finely written, nicely informal, and worth a look.

Finally, check out electronic editions before you buy them! Mathematics books, like programming books, don't always convert well.

Machine Learning & Algorithms

* The Hundred-Page Machine Learning Book, Andriy Burkov. Burkov makes a good opening decision, to catalog only the techniques that have been seen to work well, such as neural networks, trees, and the main classifier and clustering approaches. So while you won't get a complete overview of the field, what's here is relevant to engineering practice. The chapter “Basic Practices” is a particularly good overview of the real world work needed to obtain useful predictions, such as feature engineering, regularisation, hyperparameter tuning, and how to assess performance.
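To give a flavour of what those practices look like in code, here's a small scikit-learn sketch (mine, not the book's) that ties together a bit of data preparation, regularisation, hyperparameter tuning and performance assessment:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale features, then fit a regularised logistic regression;
# C (inverse regularisation strength) is chosen by cross-validation.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
search = GridSearchCV(
    pipeline,
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("held-out accuracy:", search.best_estimator_.score(X_test, y_test))
```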

* Neural Networks and Deep Learning: A Textbook, by Charu Aggarwal. It covers the basics of neural networks and goes beyond deep learning to describe other approaches, including shallow networks and architectures that are less popular today (such as Kohonen Maps and Boltzmann Machines), which is useful for getting a sense of just how general purpose a technology neural networks are (or can be). The coverage of training and just how these things “learn” is welcome. It's not a light book by any means; the author describes it as graduate level, but I'd argue it's still more accessible than well-regarded texts that came before it. It has displaced Deep Learning (below) for me, for now, as the default contemporary option for an in-depth treatment.

Bandit Algorithms for Website Optimization by John Myles White. This isn't a machine learning book, but acts as a good introduction to reinforcement learning. Via multi-armed bandits it presents non-mathematical overviews of approaches like epsilon-greedy, softmax and upper confidence bounds. After reading this, you're ready to look at reinforcement learning proper, an interesting area with lots of potential and worth knowing for the future.
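Epsilon-greedy in particular is small enough to show in a few lines; here's a toy three-armed bandit (my own sketch, not code from the book):

```python
import random

# Toy three-armed bandit with hidden payout rates.
true_payouts = [0.2, 0.5, 0.7]
estimates = [0.0, 0.0, 0.0]
pulls = [0, 0, 0]
epsilon = 0.1  # fraction of the time we explore rather than exploit

for _ in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(3)                 # explore: try a random arm
    else:
        arm = estimates.index(max(estimates))     # exploit: pick the best so far
    reward = 1 if random.random() < true_payouts[arm] else 0
    pulls[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / pulls[arm]  # running average

print(estimates)  # should end up close to the true payout rates
```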

Some books better suited as advanced or reference reading:

Artificial Intelligence: A Modern Approach, by Stuart Russell and Peter Norvig. The AIMA book is a classic, and as a survey of artificial intelligence techniques is broader than the other books mentioned here—learning is just a section. The book also covers topics like search and exploration, logic programming, agents and knowledge representation. The website has a bunch of complementary resources.

Reinforcement Learning: An Introduction 2nd ed, by Richard Sutton and Andrew Barto. The definitive text on Reinforcement Learning. The book is a nice step beyond Bandit Algorithms for Website Optimization above, and good background reading if you're playing around with something like OpenAI's Gym or are working on optimisation problems like A/B testing [1].

Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. This has become a, arguably the, default textbook for deep learning. That said, I recommend Neural Networks and Deep Learning: A Textbook over it these days as more accessible. If you're comfortable with the mathematics, which you can preview in the opening chapter, go ahead and dive in.

Data Science & Analysis

* The Data Science Design Manual, by Steven S. Skiena. The best introductory text to data science I've come across. Skiena, well known for his immensely popular book on algorithms, is a lecturer and a teacher and it shows: the book is designed as an undergrad or early graduate level introduction to the field. It's extremely well organised and has motivating examples throughout. It's also thorough, covering topics such as math basics, data preparation, statistics, visualisation, regression, ranking, clustering and classification. The book also has a comprehensive companion website, with online lectures.

* An Introduction to Statistical Learning: with Applications in R (ISL) by Gareth James, Trevor Hastie, Robert Tibshirani, and Daniela Witten. This is a great introduction to learning techniques and approaches to selecting and evaluating them. The book describes itself as a less formal and more accessible version of The Elements of Statistical Learning. For a working engineer, a language other than R for the examples would have been ideal. R is a choice reflecting the book's origins (statistics rather than software engineering), but the examples are well done, and there's more on their website.

Data Smart: Using Data Science to Transform Information into Insight by John Foreman. Under the rubric of data science it covers k-means, naive Bayes, nearest neighbours, and linear regression, among others. Incredibly, it does all this using just spreadsheets, making it a highly accessible and grounded book. Also highly recommended as a gateway book for engineering managers and product managers.

Better suited as advanced or reference reading:

The Elements of Statistical Learning (ESL) by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. This is the standard advanced textbook on statistical learning. The big brother to An Introduction to Statistical Learning above, the book is comprehensive and covers a lot of ground. It unifies ML approaches around recurring challenges like bias and dimensionality. The math, I reckon, is graduate level and very hard going (for me at least) unless you have a solid background. Even so, it's extremely well written, a fine book to dip into, lose time on a topic, and have your brain melted. Ideal as a reference book and for research.

Tools & Frameworks

Let’s look at some more hands-on programming books. First, let's talk about Python.

When compiling the original list in 2018, I mentioned there was an issue of it becoming "Machine Learning For Python Engineers". Looking into 2020, Python remains the dominant language for applied machine learning and data science in production applications. This is unlikely to change any time soon, and is due to decades of effort by the Python community on an ecosystem of libraries for numerics and data manipulation, which act as building blocks for data science and machine learning frameworks. While many frameworks are C++ under the hood (I've regularly heard it said that Python is not the real driver), Python is the medium through which they're used, making it the most useful language for developing an understanding of machine learning applications if you don't have it already.

* Python Machine Learning, 3rd Edition, by Sebastian Raschka and Vahid Mirjalili. This is the best overall introduction to machine learning I've seen targeted at programmers. Concepts are clearly explained with an almost perfect balance of theory, application, and framework use. The coverage of classifiers and neural networks in particular is thorough. Chapters 4 and 6, respectively on data preparation for training and on model training, are very welcome. The book even shows how to embed and call an ML model via HTTP. Python Machine Learning gets better with every release: the third edition, shipped in December 2019, adds coverage for TensorFlow 2, Generative Adversarial Networks (GANs) and Reinforcement Learning, all of which are good to see.

* Hands-On Machine Learning with Scikit-Learn and TensorFlow, 2nd Edition, by Aurélien Géron. The first half of this book covers common supervised approaches - classification, regression, support vector machines, trees and forests - using scikit-learn. The second half covers neural networks and deep learning with TensorFlow. There are two things Hands-On Machine Learning does well. First, instead of trying to cover the gamut of techniques, it focuses on the ones that tend to get used. Second is the overview throughout the book of important gotchas - chapter 2 discusses data preparation, chapter 3 briefly covers performance measurement, and chapter 8 is given over entirely to dimensionality reduction. If you're coming at machine learning from a software background, these are good to be aware of.
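The scikit-learn half of the book revolves around the fit/predict workflow; as a rough sketch of that shape (my example, not one from the book):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# A small built-in dataset, split into training and held-out test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit a forest of decision trees, then check performance on unseen data.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```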

Deep Learning With Python by François Chollet. This covers neural networks via the popular Keras library designed by Chollet. The book makes a useful split between text and image examples and has a great practical overview of tensors, gradients and network structures for programmers in chapters 2 and 3. Deep Learning With Python is the best programming-centric introduction to deep learning specifically I've read so far. That said, it's a few years old and Keras has since been adopted as TensorFlow's first-class high-level API, so I would try Python Machine Learning and Hands-On Machine Learning first.
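For a sense of the Keras style of model-building the book teaches, here's a minimal sketch (mine, not an example from the book) using the Keras bundled with TensorFlow:

```python
from tensorflow import keras

# MNIST digits, scaled to the [0, 1] range.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A small fully connected network: flatten the image, one hidden layer, softmax output.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
```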

R in Action: Data Analysis and Graphics with R, by Robert Kabacoff. A good intro to R, suitable as a companion to An Introduction to Statistical Learning as well as for just picking up the language. The coverage of graphing via ggplot2 is good, and the book walks you through using R for techniques such as regression, time series, sampling, clustering and classification, among others. Here's the thing with R: it's a good interactive way to learn data science and analyse data sets, but you're unlikely to engineer online systems with it. R and its ecosystem remain focused on statistics and pure data science workflows more than building machine learning applications, so if your focus is engineering applications, investing in Python will likely have a better payoff. That said, R has arguably the most comprehensive set of libraries available for statistics, data analysis, and data visualisation via ggplot2. Another good and useful R book is Hands-On Programming with R by Garrett Grolemund and Hadley Wickham, and if you want to go further into R as a tool for data analysis in its own right, then R for Data Science, also by Grolemund and Wickham, is worth a look.

Conclusion

A fair argument is that the machine learning application space is moving quickly, too quickly for books. Especially when it comes to programming, books are inevitably centred around frameworks, and so are much more likely to get outdated than online media. I think this is true, but I believe the books in this list are fairly stable. In any case, I hope this list was useful, and the suggestion from the introduction stands: do look around for other options if these don't seem like the right ones for you.


[1] If you had asked me, around the time of the first edition twenty years ago (also around the time I got my AI degree), what the most promising area in machine learning was, I would have said Reinforcement Learning. That hasn't quite worked out; there's a lot of interest in RL, but it's proven difficult to apply, notable exceptions like AlphaGo aside.