Deep Learning Software Installation Guide
Semester 2, 2024
Software Installation Guide
(drafted by Unit Coordinator Du Huynh)
If you intend to use your own laptop or your desktop at home, you will need to install Miniconda first.
Please go through this entire software installation guide before carrying out your installation.
If the installation process looks too daunting for you, then maybe you should consider using Google
Colab (see Section 5).
1 Installing Miniconda
Anaconda is an open-source distribution of the Python and R programming languages for many scientific
computing applications. It helps you simplify package management and deployment. The Anaconda
distribution includes data-science packages suitable for Windows, Linux, and macOS. You don’t need to
install Python separately as it is included with Anaconda. By default, the latest version of Python would
be installed when you install Anaconda. However, you might find that, for different projects, you will
need different versions of Python. So, it’s better to set up different Python environments for different
projects.
As of July 2024, the latest version of Python is 3.12.0. However, sometimes due to incompatibility issues
with other libraries (especially TensorFlow), you may need to use a lower version of Python. For this
unit, you need Python version ≥ 3.8. The latest version of TensorFlow that runs on Ubuntu (and maybe
other platforms also) is 2.16.2.
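As a quick sanity check of the version requirement above, the following minimal sketch compares the running
interpreter against the minimum of 3.8 stated in this guide. The function name python_is_supported is
only illustrative.

```python
import sys

# Minimum Python version stated in this guide for the unit.
MINIMUM_PYTHON = (3, 8)


def python_is_supported(version=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) part of `version` meets `minimum`.

    `version` defaults to the interpreter running this script.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old for this unit"
    print(f"Python {current}: {status}")
```

Run it with the python program of whichever environment you intend to use for the unit.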
Unfortunately, Anaconda includes many packages that you don’t actually need. Furthermore, as we will
set up a specific Python environment for the unit, the hundreds of packages downloaded and installed
in the base environment are unlikely to be used at all. To save disk space, you should install miniconda
instead. It is a significantly cut-down version of Anaconda and can be downloaded from the URL below:
https://docs.anaconda.com/miniconda/
You should choose the appropriate file that is relevant to your computer’s operating system. Ensure
that you download and install the 64-bit version (see footnote 1), as most Python packages are not
available for 32-bit processors these days. The installation process for miniconda should be very quick.
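If a Python interpreter happens to be available on your computer already, the sketch below is one possible
way to check the processor type from Python. It relies on platform.machine(), which reports the same kind
of name as uname -m. The two sets of machine names are illustrative assumptions and not an exhaustive list.

```python
import platform

# Machine names commonly reported by 64-bit and 32-bit systems.
# These sets are assumptions for illustration, not an exhaustive list.
KNOWN_64BIT = {"x86_64", "amd64", "arm64", "aarch64"}
KNOWN_32BIT = {"i386", "i686", "x86"}


def classify_machine(machine):
    """Classify a machine name as '64-bit', '32-bit' or 'unknown'."""
    name = machine.strip().lower()
    if name in KNOWN_64BIT:
        return "64-bit"
    if name in KNOWN_32BIT:
        return "32-bit"
    return "unknown"


if __name__ == "__main__":
    machine = platform.machine()
    print(f"{machine}: {classify_machine(machine)}")
```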
The installation process should bring in the following executable programs: conda, pip, pip3 (just an alias
of pip), pydoc, python, etc. By default, miniconda would be installed in your home directory. However,
you can also specify a directory where you want it to be installed. If you are not sure, just use the default
setting for everything. For instance, if you have chosen the default setting, then after the installation, you
should find the directory miniconda3 in your home directory and under miniconda3/bin, you should
see all the executable programs mentioned above.
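To confirm that the executable programs listed above are present, a minimal sketch such as the following
could be used. It assumes the default installation directory miniconda3 in your home directory on Linux
or macOS (the layout differs on Windows); the function name missing_programs is only illustrative.

```python
from pathlib import Path

# Programs that this guide says the Miniconda installer should bring in.
EXPECTED_PROGRAMS = ("conda", "pip", "pip3", "pydoc", "python")


def missing_programs(bin_dir, names=EXPECTED_PROGRAMS):
    """Return the names from `names` that do not exist inside `bin_dir`."""
    bin_path = Path(bin_dir)
    return [name for name in names if not (bin_path / name).exists()]


if __name__ == "__main__":
    # Assumed default location; adjust if you chose another directory.
    default_bin = Path.home() / "miniconda3" / "bin"
    missing = missing_programs(default_bin)
    if missing:
        print(f"Not found in {default_bin}: {', '.join(missing)}")
    else:
        print(f"All expected programs were found in {default_bin}")
```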
You should open a terminal window and type:
~/miniconda3/bin/conda init
Footnote 1: If you are still using a 32-bit computer, then you need to upgrade it. To find out whether
your computer has 32-bit or 64-bit processors: (i) On the Mac and Linux, type in a terminal window:
uname -m. If you see something like x86_64 (or arm64 on recent Macs) displayed, then it uses 64-bit
processors; if you see i686 or i386, then it uses 32-bit processors. (ii) On Windows, open File Explorer,
right click This PC and select Properties. In the popped-up window, look at the description under
System type.
This will run the conda program to initialise the PATH environment variable for you. From then on, for
any programs you want to run, you only need to type the program name (without the path) in the terminal
window that you open.
Open a new terminal window (this is usually sufficient; if it is not, reboot your computer). In the
new terminal window, type:
conda env list
If you see the error message “command not found.” then it means your PATH environment variable has
not been set up properly. Otherwise, you should see something like:
# conda environments:
#
base * /home/du/miniconda3
By default, miniconda installs a default Python environment called base (this is the name of the
environment). The “*” symbol means that it is currently the active environment.
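The listing above can also be read programmatically. The sketch below parses text in the format shown
above and reports which environment carries the “*” marker. The function name parse_conda_env_list is
hypothetical, and the parser assumes that environment paths contain no spaces.

```python
# Sample text copied from the listing shown in this guide.
SAMPLE_OUTPUT = """\
# conda environments:
#
base * /home/du/miniconda3
"""


def parse_conda_env_list(text):
    """Parse an environment listing into (name, path, is_active) tuples.

    Assumes whitespace-separated columns and paths without spaces.
    """
    envs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split()
        is_active = "*" in parts
        parts = [part for part in parts if part != "*"]
        if not parts:
            continue
        if len(parts) >= 2:
            name, path = parts[0], parts[-1]
        else:
            name, path = "", parts[0]  # environment listed by path only
        envs.append((name, path, is_active))
    return envs


if __name__ == "__main__":
    for name, path, is_active in parse_conda_env_list(SAMPLE_OUTPUT):
        marker = "active" if is_active else "inactive"
        print(f"{name or '(unnamed)'}: {path} [{marker}]")
```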
This installation procedure requires you to enter the installation commands yourself. However, it is more
interactive and, if the installation failed in any step (e.g., version incompatibility among packages), then
you can identify it. The following versions are known to be compatible (after some detailed investiga-
tion):
By default, if no version number is specified for a package, conda and pip would try to install the latest
version.
In the procedure below, we assume that the computer does not have a GPU and we will go for the latest
version of Python.
In your terminal window, type the following commands one by one:
1. (optional) upgrade conda to the latest version:
conda update -n base conda
This command is required only if you had an older version of conda previously installed.
2. create an environment called cits5017-2024 for the CITS5017 Deep Learning unit:
conda create --name cits5017-2024
3. activate the environment so that the packages installed by subsequent conda install commands are
stored there:
conda activate cits5017-2024
Note that this step is very important. If you omit it, then the subsequent installation commands
will put all the packages in the default base environment.
Type:
conda env list
Now you should see the new environment listed alongside the base environment, for example,
something like the following:
# conda environments:
#
base /home/du/miniconda3
cits5017-2024 * /home/du/miniconda3/envs/cits5017-2024
The “*” symbol should be on the line for cits5017-2024 as it should be the active environment
after we have activated it.
4. install scikit-learn:
conda install scikit-learn
You may get the version 2024.5.0 installed instead. This version should work fine as well.
After each installation command, you can type conda list to inspect the installed packages and
their version numbers.
6. install the CPU version of tensorflow:
pip install tensorflow-cpu
Note that pip would have been installed from the previous conda install command, so we don’t
need to explicitly install it ourselves. By default, tensorboard would be installed together. For this
unit, tensorflow version 2.12 onward should be sufficient.
7. Next, type:
pip install tensorflow-datasets
8. install transformers:
pip install transformers
9. install gym:
pip install gym
10. install jupyter:
conda install notebook
Jupyter-notebook and jupyter-lab provide the interface for editing and running Python notebook
(.ipynb) files. Both are similar and either one is sufficient for the unit. In the command above,
both are installed.
11. install ipywidgets (needed in Chapter 12):
conda install ipywidgets
12. install seaborn:
conda install seaborn
The dependencies of seaborn include matplotlib and pandas, so installing seaborn also installs
these two packages, which are needed for the unit as well.
13. install openpyxl. This backend library is needed for pandas to read xlsx files (the older xls
format requires a different backend).
conda install openpyxl
14. install chardet. This library package seems to be needed for Jupyter-notebook and Jupyter-lab and
needs to be installed explicitly:
pip install chardet
Packages in compressed format were downloaded by each installation command above. The installer
extracted them from the compressed files and put them in a sub-directory under your home directory
or somewhere else (on macOS, it should be in /opt/miniconda3 if you installed miniconda). After
installation, these compressed files are no longer needed, and you can save a lot of disk space by
removing them with conda clean --all. When you are asked to confirm whether a long list of files
ending with .bz2 or .conda should be removed, just type yes.
You can either deactivate the environment (by typing conda deactivate) or just close the terminal
window. Alternatively, while
the cits5017-2024 environment is still activated, you can try running some of the sample Python
code provided by the author of the textbook. Note that as the library packages are installed in the
cits5017-2024 environment, whenever you need to use the installed packages you must activate
the environment first; otherwise, you will see only the packages that come with the installation of
miniconda. After deactivation, you will be back to the default base environment. You should see
the difference by typing conda list here.
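Because forgetting to activate the environment is an easy mistake to make, a script can check this for
itself. The minimal sketch below reads the CONDA_DEFAULT_ENV environment variable, which conda sets
when an environment is activated. The function names are only illustrative.

```python
import os

# Name of the environment created for the unit in this guide.
EXPECTED_ENV = "cits5017-2024"


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if not set."""
    if environ is None:
        environ = os.environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_unit_env_active(environ=None, expected=EXPECTED_ENV):
    """Return True if the unit's environment is the active one."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    current = active_conda_env()
    if is_unit_env_active():
        print(f"Environment {current} is active.")
    else:
        print(f"Active environment is {current!r}; expected {EXPECTED_ENV!r}.")
```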
NOTE: To open a .ipynb file, always activate the environment in a terminal window, start jupyter-
notebook or jupyter-lab from a suitable directory, and navigate to the sub-directory where the
.ipynb file is and open it. Do not open the .ipynb file by just double-clicking it in your File Explorer
window. If you have made changes to a .ipynb file, then the modification date/time should reflect
that the file has been updated.
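Before starting Jupyter, it may help to confirm that the packages installed in the steps above can be
found from the active environment. The sketch below looks up each package by its import name without
actually importing it. Note that some import names differ from the install names (for example,
scikit-learn is imported as sklearn); the function name missing_modules is only illustrative.

```python
import importlib.util

# Import names of the packages installed in this guide.
UNIT_MODULES = (
    "sklearn",              # installed as scikit-learn
    "tensorflow",           # installed as tensorflow-cpu
    "tensorflow_datasets",  # installed as tensorflow-datasets
    "transformers",
    "gym",
    "notebook",
    "ipywidgets",
    "seaborn",
    "openpyxl",
    "chardet",
)


def missing_modules(names=UNIT_MODULES):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not found in this environment: " + ", ".join(missing))
    else:
        print("All packages for the unit were found.")
```

If a package is reported as missing, check that the cits5017-2024 environment was active when you ran
the script.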
Some useful conda commands can be found in the conda cheat sheet.
To install all the packages using the supplied cits5017-2024.yml file, type:
conda env create --file cits5017-2024.yml
Do not use the author’s supplied environment.yml file on the GitHub page
https://github.com/ageron/handson-ml3 as we don’t need all the packages specified there.
The process should take only a few minutes to finish. If the installation process failed at any point, you
would need to sort out the compatibility issues among the packages, modify the cits5017-2024.yml file
where needed, remove the broken environment, and recreate a new environment by repeating the conda
command above.
If the installation completed successfully, you should clean up the unwanted compressed files (see the
previous subsection) by typing:
conda clean --all
To activate and deactivate the cits5017-2024 environment, see the previous subsection.
• To find out what type of graphics card you have, type: sudo lshw -c display
If you have an NVIDIA GPU, you should see the line Configuration: driver=nvidia latency=0
in the displayed message.
• To find out what version of NVIDIA driver you are using (if you have it already installed), type:
nvidia-smi
• To install the NVIDIA driver version 450.x (for instance), type: sudo apt install nvidia-driver-450
• To find out what version of CUDA you are using (if you have it already installed), type:
nvcc --version
If you get the error message command not found, then check your PATH environment variable. If
you have CUDA installed successfully, the directory /usr/local/cuda/bin should exist.
• To install an appropriate cuDNN library, see the instructions on
https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html
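The checks in the list above can be partly automated. The minimal sketch below only looks for the
nvidia-smi and nvcc programs on your PATH and for the directory /usr/local/cuda/bin mentioned above;
it does not verify driver or CUDA versions. The function name locate_tools is only illustrative.

```python
import shutil
from pathlib import Path

# Command-line tools mentioned in this guide for NVIDIA GPUs.
GPU_TOOLS = ("nvidia-smi", "nvcc")
# Directory that this guide says should exist after installing CUDA.
CUDA_BIN = Path("/usr/local/cuda/bin")


def locate_tools(names=GPU_TOOLS):
    """Map each tool name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
    print(f"{CUDA_BIN} exists: {CUDA_BIN.is_dir()}")
```

Once a GPU build of TensorFlow is installed, calling tf.config.list_physical_devices('GPU') from
Python is another way to see whether TensorFlow itself can detect the GPU.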
5 FAQs
• What are environments and why do we need them? It is useful to set up different environ-
ments for different projects when you use Python. For example, in project 1, you might need to
use TensorFlow version 1.8.0, but in project 2, you might need TensorFlow 2.0. You can create
an environment named proj1 for project 1 and another environment named proj2 for project 2.
Different packages and different versions of the same packages can be installed in the two
environments. You should choose environment names that are meaningful and easy to remember. For
example, we use the environment name cits5017-2024 for all the programming work for the unit.
• What is Jupyter Notebook? Jupyter Notebook is an open-source web application that allows you
to create and share documents that contain live code, equations, visualisations and narrative text.
The output (graphs, plots, etc) can be displayed inside the environment. Jupyter Notebook files
have the extension .ipynb. Markdown (https://en.wikipedia.org/wiki/Markdown), a lightweight
markup language with plain-text-formatting syntax, is supported in Jupyter Notebook.
• What is JupyterLab? It is the next-generation web-based user interface for Project Jupyter. You
can start JupyterLab by typing jupyter-lab in a terminal window. You can use either jupyter-lab or
jupyter-notebook, or switch between them freely.