FYI: 5 Free Data Science eBooks For Your Summer [winter break] Reading List
Published by 劉正山
5 Free Data Science eBooks For Your Summer Reading List
Going somewhere nice for your summer holidays? Somewhere with a nice beach perhaps – Goa, Grand Cayman or Grimsby? Or a bustling city break? Wherever you’re going, there are sure to be long periods where you’ll sit for hours on end with little to do but read, so I thought I’d throw together a few free eBooks for your Kindle to while away the long hours in the airport, in a traffic jam or on the beach.
They are a mixture of books about data, analysis, statistics and R programming. All are very popular and well suited to early-stage data scientists, and they should get your mental juices flowing with ideas about how to tackle your data when you get back to your desk.
There’s even a book about data analysis for the life sciences in here, and a bonus book at the end about data cleaning (everybody likes bonuses, right?).
Best of all, they’re all free. My favourite price!
Well, in no particular order, here they are.
1. R Programming for Data Science
Author: Roger D. Peng
https://leanpub.com/rprogramming
R Programming for Data Science is about the fundamentals of R programming. Starting with the basics of R, you will learn how to manipulate datasets, write functions, and debug and optimise code.
According to Roger:
"Data science has taken the world by storm. Every field of study and area of business has been affected as people increasingly realize the value of the incredible quantities of data being generated. But to extract value from those data, one needs to be trained in the proper data science skills. The R programming language has become the de facto programming language for data science. Its flexibility, power, sophistication, and expressiveness have made it an invaluable tool for data scientists around the world."
With over 90,000 readers on LeanPub alone, this free ebook has proved a big hit worldwide. As an accompaniment to the Coursera R course, it is a good and easy read on the basics of R. With the programming concepts explained in layman’s terms, the skills taught in this book will lay the foundation for your journey into data science.
The book is offered on the Pay-What-You-Want model, including free, but there is a minimum donation level if you want the accompanying datasets, R code and lecture videos.
2. The Elements of Data Analytic Style
Author: Jeff Leek
Jeff contends that data analysis is "at least as much art as it is science". This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks.
Like Roger, Jeff is also one of the co-developers of the Johns Hopkins Specialization in Data Science, and this book is a useful reference tool for people tasked with reading and critiquing data analyses.
The Elements of Data Analytic Style, currently being enjoyed by almost 60,000 LeanPub readers, is a concise introduction to all stages of data analysis. It is a good starting point for newcomers and is also useful as a quick reference for checking that you’re on the right track.
The book is offered on the Pay-What-You-Want model, including free.
3. Regression Models for Data Science in R
Author: Brian Caffo
Although at the moment only 90% complete, this book has still attracted over 18,000 LeanPub readers and gives a brief, but rigorous, treatment of regression models from a practical perspective.
The book is intended for practicing Data Scientists, so you will need a basic understanding of statistical concepts and R programming, but as long as you tick these boxes you should be fine.
After reading the book you should be able to perform multivariate regressions and understand their interpretations.
This book is also Pay-What-You-Want, including free, and there is a minimum donation level if you want the accompanying datasets, R code and lecture videos.
4. OpenIntro Statistics
Authors: David Diez and Mine Cetinkaya-Rundel
https://leanpub.com/openintro-statistics
Written by OpenIntro, whose mission is to make educational products that are free and transparent and that lower barriers to education, this book has been used as the course text everywhere from community colleges to Ivy League colleges.
This book provides an excellent introduction to statistical analysis and model thinking, as well as tools to challenge yourself along the way (quizzes, tests, real-world examples), and moves quickly from introductory material to more complex statistics. There is also a website that provides some sample data to use in R and includes useful code snippets.
It is not so much a statistics reference as a great book for learning how to reason logically about data, probability and statistical tools.
The book is offered on the Pay-What-You-Want model, including free, and, helpfully, it is also available as a free tablet-friendly PDF.
5. Data Analysis for the Life Sciences
Authors: Rafael A Irizarry and Michael I Love
https://leanpub.com/dataanalysisforthelifesciences
This book differs from many statistical textbooks in that it focuses less on mathematics and more on using a computer to perform data analysis. Instead of explaining the mathematics and theory and then showing examples, the authors start with a practical data-related life science challenge. The book also includes the computer code that provides a solution to the problem and helps illustrate the concepts behind the solution, giving you a better intuition for the concepts, the mathematics, and the theory.
This is a good introduction to statistics at the college level, and is particularly good for those entering the life sciences.
The book is offered on the Pay-What-You-Want model, including free.
Bonus Book: Practical Data Cleaning
Author: Lee Baker
Most of the books in this list are focussed on statistics and R programming, so I thought I’d throw in something a little different for your summer reading list.
Practical Data Cleaning is a brief but thorough introduction to the basics of data cleaning for beginners and the more experienced. Following the 19 tips outlined in the book will help you to get organised and avoid many of the most common pitfalls of data collection, cleaning, classification and data integrity.
There is also a free Microsoft Excel Practical Data Cleaning template to help you get a good start with your data.
This book is being offered for free, exclusive to the Data Science Central crowd.
***Latest News***
Practical Data Cleaning is now available as a free online video course
Summary
So there you have it – 5 free eBooks (plus a bonus book) for your summer reading.
I hope you enjoy them, wherever you go.
What do you think?
It would be great if you would leave brief reviews of these books in the comments below – I’m sure all the authors would appreciate your comments and shares.
Join the debate below and let me know your thoughts...
About the Author
Lee Baker is an award-winning software creator with a passion for turning data into a story.
A proud Yorkshireman, he now lives by the sparkling shores of the East Coast of Scotland. Physicist, statistician and programmer, child of the flower-power psychedelic ‘60s, it’s amazing he turned out so normal!
Turning his back on a promising academic career to do something more satisfying, as the CEO and co-founder of Chi-Squared Innovations he now works double the hours for half the pay and 10 times the stress - but 100 times the fun!
He also wanted to be rich, famous and good looking. Ah well...
PS - Don't forget to connect with me on Twitter: @eelrekab
Other DSC Articles by the same Author
- Free Alternatives to Excel for Data Cleaning
- Why Good Data Scientists are Worth the Big Bucks
- 50 Shades of Grey – The Psychology of a Data Scientist
- Statistics is Dead – Long Live Data Science…
Disclaimer: Practical Data Cleaning was written by the author of this blog post