Exploring Entropy

FastAI dev environment in Windows using WSL

WSL Install

In this post, I set up the dev environment on Windows using its native WSL (Windows Subsystem for Linux) support. In a PowerShell window with administrative privileges, run the command

wsl --install

This should install the default option, the Ubuntu Linux subsystem. Other Linux distributions are available if required and can be selected when running the installation command (wsl --list --online lists the available names, and wsl --install -d <name> installs one).
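Once the install finishes and a Linux shell opens, a quick way to confirm that Python is really running inside WSL (and not in a Windows interpreter) is to look at the kernel release string. This is a minimal sketch; the function name running_in_wsl is my own, and the check relies on the assumption that WSL kernels include "microsoft" in their release string, which is a common heuristic rather than an official API.

```python
import platform


def running_in_wsl() -> bool:
    """Heuristic check: WSL kernels report 'microsoft' in their release string."""
    uname = platform.uname()
    # On native Windows, uname.system is "Windows", so this returns False there.
    return uname.system == "Linux" and "microsoft" in uname.release.lower()


if __name__ == "__main__":
    print("Running inside WSL:", running_in_wsl())
```

If this prints False from a notebook cell, the notebook kernel is probably using a Python installed on the Windows side instead of the one inside WSL.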

FastSetup

For the FastAI development environment, the developers provide a setup guide in the fastsetup repository on GitHub. From its README file, just follow the instructions provided for WSL. Note: the instructions talk about installing mamba, and the video courses use Miniforge for the conda and mamba installation. Since then, the scripts have switched to Miniconda, so the mamba installation step is no longer required and all required packages can be installed using conda. I also skipped the email setup steps, since I wasn't sure what they were for. The Nvidia driver setup steps for WSL from the fastsetup repository worked too: I simply copied and pasted the commands into the terminal and let them run.

Another point is that I didn't use any special environments here. I wanted to dedicate this WSL install to ML development, so I didn't think I would need a virtual environment to install packages locally. Plus, I wasn't sure which packages I should install locally and which globally. To minimize the friction, I just installed everything into the main WSL environment. Jeremy Howard gave one piece of advice about the Python environment used for this development: do not use the native Python that comes preinstalled on Linux or Mac systems. It is generally used by the operating system itself and is better left unchanged. For that reason, the fastsetup scripts also install a separate Python version with Miniconda. That Python version is the one we will be using.
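To see which Python is actually in use, the interpreter can report its own location. The sketch below is an illustration, not part of fastsetup: the function name interpreter_info is hypothetical, and the "looks_like_conda" test assumes that a conda installation keeps a conda-meta directory at the root of its prefix.

```python
import os
import sys
from pathlib import Path


def interpreter_info() -> dict:
    """Describe the running interpreter and guess whether it belongs to conda."""
    prefix = Path(sys.prefix)
    return {
        "executable": sys.executable,
        "prefix": str(prefix),
        # Assumption: conda prefixes contain a 'conda-meta' directory.
        "looks_like_conda": (prefix / "conda-meta").is_dir(),
        # Set by 'conda activate'; None if no conda environment is active.
        "conda_prefix_env": os.environ.get("CONDA_PREFIX"),
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key}: {value}")
```

If the executable path points somewhere like /usr/bin, the system Python is being used rather than the Miniconda one.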

The next step is to install the fastai-related packages.

conda install -c fastchan fastai

This should install everything you need, such as pytorch, numpy and fastai. Another optional package that helps to get Jupyter notebooks running is nbdev:

conda install -c fastchan nbdev
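After both installs, it can be reassuring to confirm what actually landed in the environment. This is a small sketch using the standard library's importlib.metadata; the helper name installed_versions is my own, and the package list assumes the Python distribution names are fastai, torch, numpy and nbdev (pytorch registers itself as "torch").

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["fastai", "torch", "numpy", "nbdev"]  # assumed distribution names
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```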

Now you can start Jupyter Notebook by running this command in the WSL terminal:

jupyter-notebook --no-browser

One piece of general advice was to avoid using pip for installation where possible, since you could end up having to install many dependencies yourself later. pip can usually install the package itself, but sometimes at the cost of missing or mismatched GPU-related dependencies. Using conda should help you avoid these.
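Whether the GPU-related dependencies ended up in a working state can be checked from Python. The sketch below is only illustrative: gpu_status is a hypothetical helper name, and it returns a "not installed" result instead of failing when torch is absent. The torch.cuda.is_available and torch.cuda.get_device_name calls are part of PyTorch.

```python
import importlib.util


def gpu_status() -> dict:
    """Report whether torch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "device": None}

    import torch  # imported lazily so the check works without torch too

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "cuda_available": available,
        "device": torch.cuda.get_device_name(0) if available else None,
    }


if __name__ == "__main__":
    print(gpu_status())
```

If torch is installed but cuda_available is False, the Nvidia driver setup or the installed pytorch build is the first place to look.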

Next, you can open Jupyter from the localhost link that is printed when you start it. Git clone the fastbook repository, which contains the entire book. All the notebooks are available there and can be run. I faced an issue when I tried to run a notebook: running its first line, which is usually "pip install fastbook", produced the error shown below. I first installed fastbook using pip in the terminal. That didn't fix the error, so I searched online for the missing file.

The error was,

OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\deera\anaconda3\Lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies.
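The message says "or one of its dependencies", so it is worth checking whether the DLL file itself is present before hunting for it. The sketch below looks for the file without importing torch (importing would just raise the same error). The function name locate_package_dll is hypothetical, and the lib subfolder is assumed from the path in the error message.

```python
import importlib.util
from pathlib import Path


def locate_package_dll(package="torch", dll_name="fbgemm.dll"):
    """Look for <package>/lib/<dll_name> without importing the package."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return {"package_found": False, "dll_path": None, "dll_exists": False}
    # Assumption: the DLL lives in the package's 'lib' folder, as in the error.
    package_dir = Path(list(spec.submodule_search_locations)[0])
    dll_path = package_dir / "lib" / dll_name
    return {
        "package_found": True,
        "dll_path": str(dll_path),
        "dll_exists": dll_path.exists(),
    }


if __name__ == "__main__":
    print(locate_package_dll())
```

If dll_exists comes back True while the import still fails, the file is there and the problem is more likely one of the libraries it depends on.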

It seems like a particular DLL file, 'fbgemm.dll', could not be loaded. In the PyTorch GitHub issues for this error, someone recommended simply installing the missing DLL, so I tried to do that. It turns out that Windows has a third-party package manager called "Chocolatey" that can be used from PowerShell for this kind of installation. The name is weird, but anyway, it is not installed by default, so I ran the following command in PowerShell with admin privileges to install the choco package manager:

Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

This should install choco, and the missing dependency can now be installed using the vcredist140 package (the Microsoft Visual C++ Redistributable):

choco install vcredist140 --version=14.29.30153

This seems to have fixed my issue. I ran the Jupyter notebooks again and all the cells are now getting executed. The Jupyter notebook experience can be further improved by adding plugins that are available through nbextensions. The following command should install all the available plugins, which can then be enabled or disabled directly from the Jupyter notebook webpage.

pip install jupyter_contrib_nbextensions

Tricks

  1. In the WSL terminal, it can be tricky to work with multiple terminals when you need one dedicated to the Jupyter notebook and another for basic terminal commands. This Microsoft page has all the shortcuts needed to run multiple panes in the WSL terminal.
  2. I also tried running the notebooks from VS Code, which needs a few extensions to get going. Upon opening, it should detect WSL and ask to install the 'WSL-Ubuntu' extension. The extension lets VS Code run in the WSL environment, and any new terminal you start will default to the Linux terminal instead of the Windows cmd. Other useful extensions are 'Jupyter' and 'Python', which enable VS Code to run Jupyter notebooks and to detect the Python environments inside WSL. With these, I don't even have to start Jupyter notebooks; I can simply do my development in VS Code. Personally, I would still prefer to read the fastbook in Jupyter and, in parallel, redo the chapter projects from a VS Code terminal in a separate repository for my own practice. That repository should capture my understanding of the book, with my own comments added.

So far, these are the steps I took to get it running on my end. As I encounter more issues, I will keep updating this page.

#tutorial