====== Windows Subsystem for Linux ======

===== Why install WSL =====

Linux is an operating system like Windows and macOS. It is widely used in scientific computing and is the de facto standard in software development and the server world. It powers our FU servers and INLET, and it is [[https://en.wikipedia.org/wiki/Free_and_open-source_software|Free and Open Source]].

The [[https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux|Windows Subsystem for Linux 2]] (WSL2) allows you to run a Linux command line within Windows. It comes without a graphical interface, but provides a good alternative to [[https://en.wikipedia.org/wiki/Multi-booting|dual booting]] or switching to Linux entirely.

In WSL2, you have easy access to text processing tools like ''grep'', ''sed'', and ''awk'' and to scripting languages like Bash, Python and Perl. You can also install CWB+CQP, use the TreeTagger, and much more. In other words, you have the full power to set up your own corpus lab right on your own computer.

===== Install WSL2 on Windows 10 =====

Requirements: 800 MB of free space

  - Activate Features
  - Set default version to WSL2
  - Install a distribution
  - Update

=== 1. Activate Features ===

You need to open the settings window //Turn Windows Features on or off//. Either type "features" into the search bar (this works with both English and German language settings) and select //Turn Windows Features on or off//, or bring it up via the Control Panel (German: Systemsteuerung): //Control Panel// > //Programs// > //Turn Windows Features on or off//.

You should see a list of available features with boxes that you can tick. Tick the box next to //Windows Subsystem for Linux//. If you see //Virtual Machine Platform// in the list, activate it as well. Confirm the changes by clicking //OK// at the bottom of the window. You will have to reboot your computer.

=== 2. Set default version to WSL2 ===

Currently, there are two versions of WSL, and we recommend version 2.
Open //Windows PowerShell// (find it by right-clicking the Start button or by typing its name into the Start menu) and run the following command:

  wsl --set-default-version 2

You may be asked to download the latest version of the kernel. In that case, copy the link from the response into your browser (it may look like this: https://aka.ms/wsl2kernel), download the package and follow the quick installation procedure.

=== 3. Install a distribution ===

Open the //Microsoft Store// on your computer. Type "linux" into the search bar, and it should pull up a list of the available distributions. We recommend installing //Debian//, which is the Linux distribution used on the university server. Complete the download and launch the app. It will open a terminal and start an installation process. At the end, you will be asked to create a default UNIX account, which requires a username and a password.

**Note that this account does not affect your Windows settings and is confined to this Linux distribution, so you will only need the password within WSL.**

=== 4. Update ===

Once the set-up is complete, it is advisable to update the system. Type or copy the following command into the command line and press Enter to run it:

  sudo apt update && sudo apt dist-upgrade

This might take a few minutes. Once the command has finished, the installation process is complete.

===== Using WSL2: First steps =====

There are two main ways to run WSL. The first option is to open Debian via the Start menu, which opens the same terminal you saw during installation. It puts you inside Linux, in your Linux home folder ''~/'', which is short for ''/home/USERNAME/''.

To get to the folder where all your personal files are located, you can use the command ''cd'' (change directory). Your Windows home folder is located at ''/mnt/c/Users/YourName/''.
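The location given above follows a simple pattern: the drive letter becomes a lowercase folder under ''/mnt/'' and backslashes become forward slashes. The following is a minimal sketch of that conversion, assuming the default mount point ''/mnt'' and a path of the form ''X:\...''. The name ''win_to_wsl_path'' is a hypothetical helper chosen for illustration and is not part of WSL (WSL ships its own ''wslpath'' tool for this job).

```shell
# Sketch only: win_to_wsl_path is a hypothetical helper, not a WSL command.
# It assumes the default WSL mount point /mnt and a path of the form X:\...
win_to_wsl_path() {
  # The first character is the drive letter; WSL mounts drives in lowercase.
  drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
  # Everything after "X:" with backslashes turned into forward slashes.
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Single quotes keep the shell from interpreting the backslashes.
win_to_wsl_path 'C:\Users\YourName\Desktop'
```

The call at the end prints ''/mnt/c/Users/YourName/Desktop'', which is the kind of path you can pass to ''cd''.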
To get to the files on your Desktop, for example, enter the following:

  cd /mnt/c/Users/YourName/Desktop

You can use the list command ''ls'' to show the files in the current directory:

  ls

Alternatively, you can open //Windows PowerShell// and type ''wsl''. The prompt will change to a Linux prompt that ends with a ''$''. This puts you in the directory PowerShell was in, which is usually your Windows home folder. To get to the Linux home folder, just enter ''cd'' without any arguments. There is also //Windows Terminal//, which you can install from the Store and which provides features like tabs and more customization options.

Once you are in WSL, you are in a shell, which is an environment that executes the commands you enter. The default Linux shell is called [[http://mywiki.wooledge.org/BashGuide|Bash]]. What you see in your terminal is a prompt that ends in ''$'' and displays which directory you are in.

===== Using SSH in WSL =====

To access another server with the command ''ssh'', for example the university server on which you use cqp, you need the OpenSSH client. Launch WSL and install it with the following command:

  sudo apt install openssh-client

Once the installation is finished, you should be able to connect to other servers via ''ssh'' the same way you may have done in //Windows PowerShell//.

===== Setting up the IMS Open Corpus Workbench on WSL =====

Once you have WSL up and running, you can install CWB and use CQP in a few steps. This allows you to install, compile, store and search corpora locally on your own computer.

  - Download the [[http://cwb.sourceforge.net/download.php#cwbDownload|package]] and unpack
  - Run the installation script
  - Add to the $PATH
  - Create the registry folder
  - Alias ''cqp'' as ''cqp -e''

=== 1. Download and unpack ===

At the time of writing, the Corpus Workbench (CWB) can only be compiled from source code, which is available [[http://cwb.sourceforge.net/download.php#cwb|here]] under the header **CWB main package**.
The CWB main package comes as a tarball (''.tar.gz''). You need to know the directory in which the file is located. The commands below assume that it is in the //Downloads// folder. If it is not there, change the path accordingly. Also change ''YourName'' to the name of your Windows account (the one you see on login). Use the ''TAB'' key for autocompletion.

  cd /mnt/c/Users/YourName/Downloads/

Confirm with ''ls'' that the file is there and unpack it.((You can specify a different directory for the unpacked folder by using the ''-C'' option, e.g. ''tar xvzf cwb-3.4.22.tar.gz -C ~/'' to put it in the Linux home folder. See [[https://www.tutorialspoint.com/unix_commands/tar.htm|here]] for more information on ''tar''.))

  tar xvzf cwb-3.4.22.tar.gz

=== 2. Run the installation script ===

You can now list the files in the new directory with ''ls cwb-3.4.22''. You should see various files and directories. One of them is called ''INSTALL''; it explains the manual installation procedure, in case you are interested or have a different setup. Otherwise, just run the install script:

  sudo ./cwb-3.4.22/install-scripts/install-linux

You will have to enter the WSL password that you set during installation. The installation might take a few minutes to complete.

In theory, CWB and CQP should now be ready to use. However, a few adjustments may be necessary. To test whether your shell finds the newly installed commands like ''cqp'', run:

  cqp

If cqp launches the way you know it from the university server, you can skip the next section. Note that the command ''cqp'' alone does not allow you to use the arrow keys and requires you to end every command with a semicolon (e.g. type ''exit;'' to stop cqp). To use cqp as usual, launch it like this:

  cqp -e

=== 3. Add to the $PATH ===

If cqp does not start, i.e. if the shell reports that the command cannot be found, you need to tell the shell where to look for the installed files.
To do that, you have to permanently add the directory that contains the installed files to the ''PATH'' variable of the Linux system. The ''PATH'' variable is a list of directories that your Linux system automatically searches for commands. The CWB files are installed in this location: ''/usr/local/cwb-3.4.22/bin/''. You can check by listing the contents of the folder. If it exists and contains files like ''cqp'' or ''cqpcl'', it is the right place:

  ls /usr/local/cwb-3.4.22/bin/

To add the cqp tools to the path, you need to edit the configuration file ''.bashrc''. This file is read at the start of every terminal session and configures it. To edit the file, we will use //nano//, which is installed by default.((If bash can't find the command ''nano'', install it by running ''sudo apt install nano''.)) ''.bashrc'' is located in your Linux home directory.

  nano ~/.bashrc

You navigate the file with the arrow keys. Move to the bottom of the file and paste the following line there:((If your binary files are located in a different place, use that path following this schema: ''export PATH=$PATH:/Your/Path/''.))

  export PATH=$PATH:/usr/local/cwb-3.4.22/bin/

Save your changes by pressing Ctrl+s (Strg+s on German keyboards) and exit with Ctrl+x (Strg+x). The file now needs to be read again for the changes to take effect. You do this with the ''source'' command:

  source ~/.bashrc

If you now try to launch cqp (or better, ''cqp -e''), it should work as usual and prompt you to choose a corpus. A message will alert you to a missing registry folder, which is the subject of the next section.

=== 4. Create the Registry Folder ===

If you want to use an already existing corpus on your own computer with your CWB installation, you will need the corpus files and the corresponding registry file. The CWB installation creates and looks for registry files in a specific place. By default, this is a folder called //registry// in the following location: ''/usr/local/cwb-3.4.22/share/cwb/registry''.
You can create it with the ''mkdir'' (make directory) command. Use it as follows:((A tip: To avoid typing the full name of every folder, type the first letter or the first few unique letters and press TAB. This autocompletes the name of the folder.))

  sudo mkdir -p /usr/local/cwb-3.4.22/share/cwb/registry

If you start cqp now, the warning message should be gone.

=== 5. Alias ''cqp'' as ''cqp -e'' ===

To execute ''cqp -e'' by default when typing just ''cqp'', like on the university server, you may want to set an //alias//. This can also be done in ''.bashrc''. Simply open the file again (no ''sudo'' is needed, because the file belongs to you):

  nano ~/.bashrc

And add the following line at the end:

  alias cqp='cqp -e'

The file then needs to be read again with ''source ~/.bashrc''. If you now start cqp with the command ''cqp'', it should work exactly the way you know it from the university server, as though you had typed ''cqp -e''.

===== Adding an Existing Corpus =====

As mentioned before, to add an existing corpus you need the corpus files and a registry file. Place the registry file, which bears the name of the corpus, in the registry folder you created in a previous step. Here, we will do it using the command ''cp'' (copy). Move into the folder that contains the registry file with ''cd'' and copy the file into the registry folder like this, replacing "registryfile" with the actual name of the registry file, which is the same as the name of your corpus:

  sudo cp registryfile /usr/local/cwb-3.4.22/share/cwb/registry

The registry file has to specify the path to your corpus files. For example, if the corpus files were located in a folder called //corpusfiles// on your Desktop, the path would look like this: ''/mnt/c/Users/YourName/Desktop/corpusfiles''. Note down the path to the corpus files and open the registry file:

  sudo nano /usr/local/cwb-3.4.22/share/cwb/registry/registryfile

In the registry file, you should see the comment ''# path to binary data files'', underneath which it says ''HOME'', followed by a path.
Delete that path, as it may not be the right one for your computer, and add your own path instead. In the end, the line may look like this: ''HOME /mnt/c/Users/YourName/Desktop/corpusfiles''.

Now that the registry file is in the right place and contains the correct path to your corpus files, you can try to access the corpus with ''cqp''. If the corpus you just added appears among the available corpora and you can query it, it worked!
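The manual edit of the ''HOME'' line described above can also be scripted. The following is a minimal sketch, assuming that the registry file contains exactly one line starting with ''HOME '' and that the new path contains no ''|'' or ''&'' characters. The name ''set_corpus_home'' is a hypothetical helper chosen for illustration and is not part of CWB. The demonstration runs on a throwaway file, so no real registry file is touched.

```shell
# Sketch only: set_corpus_home is a hypothetical helper, not a CWB command.
# $1 = registry file, $2 = directory that holds the binary corpus files.
set_corpus_home() {
  # Replace everything after "HOME " on the line that starts with HOME.
  # "|" serves as the sed delimiter because paths contain "/".
  sed -i "s|^HOME .*|HOME $2|" "$1"
}

# Demonstration on a temporary file that mimics the relevant registry lines.
demo_file="$(mktemp)"
printf '%s\n' '# path to binary data files' 'HOME /old/path/to/corpus' > "$demo_file"

set_corpus_home "$demo_file" "/mnt/c/Users/YourName/Desktop/corpusfiles"

# Show the rewritten line.
grep '^HOME' "$demo_file"
```

The last command prints ''HOME /mnt/c/Users/YourName/Desktop/corpusfiles''. To change a real registry file under ''/usr/local/'', the ''sed'' call would need root rights, for example by running the same ''sed -i'' command with ''sudo''.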