Created by Nathan Kelber for JSTOR Labs under Creative Commons CC BY License
For questions/comments/improvements, email nathan.kelber@ithaka.org.


Command Line Skills for Text Analysis#

Description: This notebook is intended to introduce users to the command line. It describes:

  • Why you should learn the command line

  • How to install a Unix command line on your computer

  • Various commands for manipulating files

  • How to install Python libraries on your local machine

Use Case: For Learners (Detailed explanation, not ideal for researchers)

Difficulty: Beginner

Completion Time: 30 minutes

Knowledge Required: None

Knowledge Recommended: None

Data Format: None

Libraries Used: None

Research Pipeline: None


Why learn the command line?#

Most of our everyday experiences with computers involve using a mouse, trackpad, or touchscreen. We use familiar operations like “clicking”, “double-clicking”, or “pinching” on a Graphical User Interface (GUI) to accomplish everyday tasks like browsing the web, reading email, and playing games. Most computer users never use the command line, which forgoes the cursor and relies only on text input from the keyboard. Using computers in this way can seem opaque and difficult, so why do folks learn it?

There are a few reasons:

  • Many important and useful programs have no GUI

  • The command line helps Python coders start JupyterLab and install Python packages

  • Certain file operations, like renaming hundreds of files at once, are tedious or impossible in most GUIs

  • The command line is faster for many tasks, requiring a few keystrokes rather than many clicks

  • Multi-step command line tasks can be scripted, letting users run them multiple times or at particular times

  • The command line offers a simple way to connect to and interact with internet servers
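
As a preview of the batch renaming and scripting mentioned in the list above, here is a minimal sketch of a shell loop. It assumes a Unix-style shell such as Terminal or Git Bash. The folder name rename-demo and the file names are made up for illustration, and the mv command it relies on is explained later in this notebook.

```shell
# Work in a scratch folder so no real files are touched (hypothetical name)
mkdir -p rename-demo
cd rename-demo

# Create three empty sample files to practice on
touch draft1.txt draft2.txt draft3.txt

# Rename every .txt file in this folder by adding the prefix "final_"
for f in *.txt; do
    mv "$f" "final_$f"
done

# List the renamed files
ls

# Return to the folder we started in
cd ..
```

The same loop works whether the folder holds three files or three hundred, which is the kind of task that takes many clicks in a file browser.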

Installing the Command Line#

Mac OS X#

The command line is already installed by default on Mac OS X machines in a utility named “Terminal.”

Windows 10#

There is a command line installed by default on Windows 10, but we do not recommend learning the command line with it. One of the most significant advantages of learning the command line is the ability to connect to and interact with internet servers. The vast majority of servers in the world run on Linux, which accepts Unix-based commands. By installing a Unix-based command line on your Windows machine, you will learn a more popular and portable version of the command line.

  1. Download Git Bash

  2. Install Git. For each setting, choose the default. You may optionally change the default editor from Vim (an advanced text editor) to something simpler like Notepad. This is not necessary for our purposes, but it may make things easier later when you need to edit text files.

Opening a Command Line Window#

Mac OS X#

  1. Click on the magnifying glass icon in the upper right-hand corner of the screen. (Shortcut: ⌘ + spacebar)

  2. Type terminal and then press return.

Windows 10#

  1. Click on the Windows Start Menu in the lower left corner. (Shortcut: Windows key)

  2. Type git bash and then press return.

Git Bash vs. Windows Command Prompt

To check which one you are using, look at the title in the upper left-hand corner of the window. Git Bash will say “MINGW64”. Command Prompt will say “Command Prompt”. This lesson teaches the more widely used Unix-based commands in Git Bash.

Figure: The visual difference between Windows Command Prompt and Git Bash

Create and edit files#

The nano editor#

Installation on Windows 10#

We recommend the text editor Nano. It is installed by default on Mac OS X, but it must be installed on Windows 10.

  1. Download nano for Windows

  2. Rename the file to nano.exe

  3. Move it to C:\Program Files\Git\usr\bin

  4. Open Git Bash and enter: git config --global core.editor "winpty nano"

Opening or creating a file with nano#

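To create a new plaintext file or open an existing one with nano, such as “file.txt”, use nano file.txt. To open an existing file this way, you must be in the same directory as the file. If no file with that name exists there, nano starts a new, empty one. The nano file editor will open in the terminal window.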

Using the nano editor

  1. Make any desired changes

  2. Press ctrl + x to exit

  3. Type y and press enter/return to confirm you want to save changes

  4. Confirm (or change) the filename and press enter/return

Move and rename files mv#

The mv command allows you to move and/or rename a file. To rename a file without moving it, use the syntax:

mv oldfilename newfilename

when you are in the directory where the file is located.

To move a file called file.txt into a folder, the syntax is:

mv file.txt newfolder/file.txt

assuming there is a newfolder in the same directory. It is possible to both move and rename a file in a single command, such as:

mv oldfilename.txt newfolder/newfilename.txt

If you need to move a file up a directory, you can supply the full (absolute) path of the destination, starting with the root:

mv file.txt /usr/username/folder/file.txt

Or you can use .. to indicate the file should be moved up a directory:

mv file.txt ../file.txt
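
The mv forms above can be tried safely in a scratch folder. The following is a minimal sketch, not the only way to do it. The folder name mv-demo is hypothetical, and the helper commands touch (create an empty file), cd (change directory), and ls (list files) are used only for setup and checking.

```shell
# Set up a scratch area (hypothetical names) with one file and one subfolder
mkdir -p mv-demo/newfolder
cd mv-demo
touch oldfilename.txt

# Rename the file without moving it
mv oldfilename.txt file.txt

# Move it into newfolder, keeping its name
mv file.txt newfolder/file.txt

# Step into newfolder, then move the file up one level and rename it
cd newfolder
mv file.txt ../newfilename.txt

# Go back up and list the folder to confirm where the file ended up
cd ..
ls

# Return to the starting folder
cd ..
```

Note that mv will silently overwrite an existing file with the same name at the destination, so it is worth checking with ls before moving anything important.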

Make Directory mkdir#

The mkdir command will create a new directory in your present working directory:

mkdir newdirectoryname
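
If you need several nested directories at once, the -p option can help. This is a short sketch using the made-up directory name mkdir-demo. It assumes the standard mkdir found in Terminal and Git Bash.

```shell
# Create a single directory in the present working directory
mkdir mkdir-demo

# Without -p, this would fail because "data" does not exist yet;
# with -p, any missing parent directories are created too
mkdir -p mkdir-demo/data/raw

# -p also makes the command safe to repeat: no error if the directory exists
mkdir -p mkdir-demo/data/raw

# Show the nested structure
ls -R mkdir-demo
```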

Unzip a file unzip#

The unzip command will extract the contents of a compressed zip file:

unzip filename.zip

Installing Python packages#

If you need a particular package installed for Python, the default package manager is called pip. A second package manager called conda is popular in data science and is often used with the Anaconda distribution of Python that contains a large variety of packages and software suitable for data science work.

For most users, it is fine to install a package once for all of their work. If you find yourself working on multiple projects at a time that need different packages, it is a good idea to look into virtual environments. For example, you may work on one project that uses version 0.7 of a package and another project that uses version 2.0. Without virtual environments, each time you switched projects you would need to reinstall the proper version of the package for the project you’re working on. Virtual environments allow you to load a different set of packages depending on which project you’re working on.
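
As one possible illustration, Python ships with a module named venv that creates virtual environments. The sketch below assumes Python 3 is available as python3 (on Windows it is often just python). The environment name demo-env is hypothetical. In Git Bash the activation script usually lives at demo-env/Scripts/activate rather than demo-env/bin/activate.

```shell
# Create a virtual environment in a new folder (hypothetical name: demo-env)
python3 -m venv demo-env

# Activate it (Mac/Linux path). The leading dot does the same job as the
# "source" command in Bash.
. demo-env/bin/activate

# While active, "python" and "pip" refer to the copies inside demo-env;
# this prints the folder the active Python is running from
python -c "import sys; print(sys.prefix)"

# Leave the environment when finished
deactivate
```

Packages installed while the environment is active stay inside demo-env, so each project can keep its own versions.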

The Python package manager pip#

Installing a package with pip is simple:

pip install package-name

You can uninstall a package with:

pip uninstall package-name
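
It can also help to check what pip has already installed. The following is a small sketch, assuming Python 3 is available as python3. Running pip as python3 -m pip is an optional habit that makes it clear which Python the packages belong to. The example inspects pip itself because it is normally already present.

```shell
# Show the pip version and which Python installation it belongs to
python3 -m pip --version

# List every package currently installed, with version numbers
python3 -m pip list

# Show details (version, location, dependencies) for one installed package
python3 -m pip show pip

# To request a specific version when installing, pip accepts "==", e.g.:
#   pip install "package-name==2.0"
# (package-name is a placeholder, so the line is left commented out)
```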

The Conda package manager conda#

The Anaconda distribution of Python comes with a graphical user interface that allows you to install and uninstall packages (and create virtual environments) with a mouse. However, you can also use the command line and the process is very simple.

To install:

conda install package-name

To uninstall:

conda uninstall package-name