Aspiring students often ask me how to get started with Python programming and prepare for their Data Analytics journey. I have put together a set of guidelines that start the reader off in a simple yet direct way. You will dive right into using Jupyter Notebook, and by following the steps in this article you should, in a few days' time, have the know-how to start your Data Analytics journey with ease and confidence.
Read on to understand why I call Jupyter Notebook your best friend, how to initiate the friendship, and how to strengthen the relationship.
Why Use Jupyter Notebook
If you are an aspiring Data Scientist, or an experienced one migrating to Python for Data Analytics, Machine Learning, or Deep Learning, Jupyter Notebook will likely become your foremost development environment.
The notebook is used for Data Science (and other coding jobs) by everyone from students and industry practitioners to researchers, senior scientists, and engineers.
What is Jupyter Notebook
It is one of the most popular development platforms for data analysis using Python, R, or Julia. It is a web-based interactive computing platform that allows users to author documents that combine live code, equations, narrative text, interactive dashboards, and other rich media.
Software engineers have long had access to IDEs (Integrated Development Environments) that integrate tools such as debugging and version control. However, interactivity had never been the focus for the software industry. In contrast, the scientific community was used to more flexible environments like Matlab and Mathematica. When IPython arrived in the Python world, it was a big leap towards an interactive environment in which live code, narrative text, output fields, and visualizations are integrated into documents that tell stories using code and data. The ease of achieving this level of integration pushed the software community to adopt IPython, and the subsequent work to support other languages made it the prominent common platform for various Data Science languages.
You may also find this write-up on Project Jupyter by Fernando Pérez, program chair of JupyterCon, an interesting read.
Where does the name Jupyter Notebook come from
The name comes from two places. The first, of course, is the planet Jupiter. The second is the set of core programming languages supported by Jupyter: Julia, Python, and R, from which the name is formed. In particular, the “y” in the middle of Jupyter was chosen to honor the Python heritage. Jupyter evolved from the IPython project (started in 2001), which focused on interactive computing in Python. IPython was built with an ethical commitment to open source. When kernels for languages like Julia and R were created for IPython, this cross-language usage created the need for a more language-independent platform, and in 2014 the project was renamed to the more inclusive “Jupyter Notebook”.
How to Begin to Use Jupyter Notebook
There are a few options:
- Visit this site hosted by Rackspace to quickly launch a temporary session of a Jupyter Notebook.
- Google Colaboratory – This is a recent launch by Google. It is a Jupyter Notebook environment that requires no setup, just like Google Docs or Google Slides. The biggest advantage is being able to share files through Google Drive and work collaboratively. Its features are limited right now, so look out for further announcements:
a.) Supports Python 2.7 only, not Python 3
b.) No support for R or Scala
c.) Works with the desktop version of Chrome
- Install the software on your machine. The most flexible way is to download and install the Anaconda distribution for Python 3.x.
Anaconda is a widely trusted distribution that supports the installation of 1,000+ Data Science packages and manages your packages, dependencies, and environments, all with the single click of a button. Once you have downloaded and installed Anaconda, you can launch Jupyter Notebook and start learning to load, slice, and dice data immediately. This works across all platforms: Linux, macOS, and Windows.
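To give a feel for what "load, slice and dice" might look like in a first code cell, here is a minimal sketch. It uses only the Python standard library, and the city and sales figures are made up purely for illustration. In real work you would more likely reach for a package such as pandas, which ships with Anaconda.

```python
import csv
import io
import sys

# Confirm the kernel is running Python 3 (the Anaconda 3.x install).
print(sys.version_info.major)

# Hypothetical in-memory data standing in for a real CSV file.
raw = io.StringIO(
    "city,year,sales\n"
    "Pune,2017,120\n"
    "Pune,2018,150\n"
    "Delhi,2017,200\n"
    "Delhi,2018,180\n"
)

# Load: each row becomes a dict keyed by the header names.
rows = list(csv.DictReader(raw))

# Slice: keep only the first two rows.
first_two = rows[:2]

# Dice: total sales per city.
totals = {}
for row in rows:
    totals[row["city"]] = totals.get(row["city"], 0) + int(row["sales"])

print(first_two)
print(totals)
```

Type this into a code cell and run it with Shift+Enter; the printed output appears directly below the cell.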
Basics of Jupyter Notebook terminologies
The Notebook combines three components:
- The notebook web application: An interactive editor for writing and running code and authoring notebook documents.
- Kernels: Separate processes, started by the notebook web application, that run the code.
- Notebook documents: Documents that contain a representation of all content visible in the notebook, including inputs and outputs of the computations, narrative text, images, and rich media representations of objects. Each notebook document has its own kernel.
Notebooks consist of a linear sequence of cells. There are four basic cell types:
- Code cells: Input and output of live code that is run in the kernel.
- Markdown cells: Narrative text. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).
- Heading cells: 6 levels of hierarchical organization and formatting.
- Raw cells: Unformatted text that is included, without modification, when notebooks are converted to different formats using nbconvert.
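Under the hood, a notebook document is a JSON file (saved with the .ipynb extension) that stores this sequence of cells. The sketch below builds a tiny notebook structure by hand so you can see how the cell types are represented. It assumes the version 4 notebook format; the `make_cell` helper and the cell contents are my own illustration, not part of Jupyter. Heading cells are left out because, to my knowledge, version 4 of the format expresses headings inside Markdown cells instead.

```python
import json


def make_cell(cell_type, source):
    """Build one cell dict in the (assumed) nbformat 4 layout."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells also record their outputs and an execution counter.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        make_cell("markdown", "# My first notebook\nSome narrative text."),
        make_cell("code", "print(1 + 1)"),
        make_cell("raw", "Text passed through untouched by nbconvert."),
    ],
}

# Serialize to JSON text, then read it back to list the cell types in order.
notebook_json = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(notebook_json)["cells"]]
print(cell_types)
```

In practice you never write this JSON yourself; the notebook web application does it for you each time you save.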
Jupyter Notebook Tutorial Using Python
Now that the software is installed and you know the terminology, are you ready to begin? Let’s get started with Python:
I recommend becoming familiar with the Python basics by working through the notebook tutorials listed below. Remember not to copy the code; rather, type your own code, do your own markup, and write some good initial narratives, as this is the foundation you build before jumping into data science.
1 – Start with this notebook to get comfortable with Python syntax and Arithmetic and Relational Operators http://nbviewer.jupyter.org/github/rajathkumarmp/Python-Lectures/blob/master/01.ipynb
2 – This notebook will get you into print and basics of string manipulations
3 – This notebook will help you get comfortable with the data structures – List and Tuples
4 – Continue with String manipulation and use of Dictionaries
5 – Now you are ready to get into the programming aspects of conditions, loops and writing functions
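As a quick preview of the topics those five notebooks cover, here is a small sketch of my own (it is not taken from the linked tutorials) that touches each of them once. Try typing it into a code cell rather than pasting it.

```python
# Arithmetic and relational operators
total = 7 + 3 * 2        # multiplication binds tighter than addition
is_large = total > 10    # relational operators return True or False

# print and basic string manipulation
name = "jupyter"
greeting = "Hello, " + name.capitalize() + "!"
print(greeting)

# Lists (mutable) and tuples (immutable)
squares = [n ** 2 for n in range(5)]
point = (2, 3)

# Dictionaries map keys to values
word_lengths = {word: len(word) for word in ["list", "tuple", "dict"]}


# Conditions and functions
def classify(n):
    if n % 2 == 0:
        return "even"
    return "odd"


# Loops
labels = []
for n in squares:
    labels.append(classify(n))

print(total, is_large)
print(squares, labels)
print(point, word_lengths)
```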
How to Start Learning Basics of Jupyter Notebook?
Now that you are comfortable with Jupyter Notebook and the 101 of Python programming, you are ready to start learning Data Analytics. Download the Data Science Course brochure and register for the course now!
The course will get you started on your journey to become a Data Analyst, with hands-on, practice-based learning across all the modules, including Statistics and Machine Learning. The delivery is designed around notebooks. You will learn the concepts and do Class Labs and Home Assignments.