Created by Nathan Kelber for JSTOR Labs under Creative Commons CC BY License
For questions/comments/improvements, email nathan.kelber@ithaka.org.
Command Line Skills for Text Analysis#
Description: This notebook is intended to introduce users to the command line. It describes:
Why you should learn the command line
How to install a unix command line on your computer
Various commands for manipulating files
How to install Python libraries on your local machine
Use Case: For Learners (Detailed explanation, not ideal for researchers)
Difficulty: Beginner
Completion Time: 30 minutes
Knowledge Required: None
Knowledge Recommended: None
Data Format: None
Libraries Used: None
Research Pipeline: None
Why learn the command line?#
Most of our everyday experiences with computers involve using a mouse, trackpad, or touchscreen. We use familiar operations like “clicking”, “double-clicking”, or “pinching” on a Graphical User Interface (GUI) to accomplish everyday tasks like browsing the web, reading email, and playing games. Most computer users never use the command line, which foregoes using a cursor, and relies only on text input from the keyboard. Using computers in this way can seem opaque and difficult, so why do folks learn it?
There are a few reasons:
Many important and useful programs have no GUI
The command line helps Python coders start JupyterLab and install Python packages
Certain file operations, like renaming hundreds of files at once, are tedious or impractical in a GUI
The command line is faster for many tasks, requiring a few keystrokes rather than many clicks
Multi-step command line tasks can be scripted, letting users run them multiple times or at particular times
The command line offers a simple way to connect to and interact with internet servers
Installing the Command Line#
Mac OS X#
The command line is already installed by default on Mac OS X machines in a utility named “Terminal.”
Windows 10#
There is a command line installed by default on Windows 10, but we do not recommend learning the command line with it. One of the most significant advantages of learning the command line is the ability to connect and interact with internet servers. The vast majority of servers in the world run on Linux, accepting unix-based commands. By installing a unix-based command line on your Windows machine, you will learn a more popular and portable version of the command line.
Install Git, which includes the Git Bash command line. For each setting in the installer, choose the default. You may, optionally, wish to change the default editor from Vim (an advanced text editor) to something simpler like Notepad. This is not necessary for our purposes, but it may be easier for you later when you need to edit text files.
Opening a Command Line Window#
Mac OS X#
Click on the magnifying glass icon in the upper righthand corner of the screen. (Shortcut: ⌘ + spacebar)
Type terminal and then press return.
Windows 10#
Click on the Windows Start Menu in the lower left corner. (Shortcut: Windows key)
Type git bash and then press return.
Git Bash vs. Windows Command Prompt
To check which one you are using, look in the upper lefthand corner of the window. Git Bash will say “MINGW64”; Command Prompt will say “Command Prompt”. This lesson teaches the more widely used Unix-based commands in Git Bash.
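The same check can be made from inside the window by asking the shell what system it is running on. The following is a minimal sketch, assuming the uname command is available (it is in Git Bash, macOS Terminal, and Linux shells); the variable name system and the printed messages are illustrative choices, not part of the lesson.

```shell
# Ask the shell which system it is running on and keep the answer
system="$(uname -s)"

# Compare the answer against a few common patterns
case "$system" in
    MINGW*) echo "This looks like Git Bash on Windows" ;;
    Darwin) echo "This looks like Terminal on a Mac" ;;
    Linux)  echo "This looks like a Linux shell" ;;
    *)      echo "Unrecognized system: $system" ;;
esac
```

If the window is Windows Command Prompt instead, uname is normally not available there, so an error message about an unrecognized command is itself a sign that you are not in Git Bash.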
Create and edit files#
The nano editor#
Installation on Windows 10#
We recommend the text editor Nano. It is installed by default on Mac OS X, but it must be installed on Windows 10.
Download nano for Windows
Rename the file to nano.exe
Move it to C:\Program Files\Git\usr\bin
Open Git Bash and enter:
git config --global core.editor "winpty nano"
Opening or creating a file with nano#
To create a new plaintext file or open an existing one with nano, such as “file.txt”, use nano file.txt when in the same directory as the file. The nano file editor will open in the terminal window.
Make any desired changes
Press ctrl + x to exit
Type y and press enter/return to confirm you want to save changes
Confirm (or change) the filename and press enter/return
Move and rename files mv#
The mv command allows you to move and/or rename a file. To rename a file without moving it, use the syntax:
mv oldfilename newfilename
when you are in the directory where the file is located.
To move a file called file.txt into a directory called newfolder that already exists in the present working directory, the syntax is:
mv file.txt newfolder/file.txt
It is possible to both move and rename a file in a single command, such as:
mv oldfilename.txt newfolder/newfilename.txt
If you need to move a file up a directory, you can supply a full path starting with the root:
mv file.txt /usr/username/folder/file.txt
Or you can use .. to indicate the file should be moved up a directory:
mv file.txt ../file.txt
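The mv forms above can be tried safely in a throwaway directory. The following is a minimal sketch, not part of the original lesson: the file and folder names (draft.txt, notes.txt, final.txt, newfolder) are made up for illustration, and it also uses mktemp, cd, echo, and ls, which this notebook does not otherwise cover.

```shell
# Work in a temporary directory so no real files are touched
workdir="$(mktemp -d)"
cd "$workdir"

# Set up one folder and one small text file to practice on
mkdir newfolder
echo "draft text" > draft.txt

# Rename the file without moving it
mv draft.txt notes.txt

# Move the file into newfolder, keeping its name
mv notes.txt newfolder/notes.txt

# From inside newfolder, move the file up one directory and rename it
cd newfolder
mv notes.txt ../final.txt
cd ..

# List what is left in the practice directory
ls
```

Note that mv overwrites an existing file at the destination without asking. Adding the -i option (mv -i) makes it ask for confirmation first.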
Make Directory mkdir#
The mkdir command will create a new directory in your present working directory:
mkdir newdirectoryname
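As a short illustration, the sketch below creates a single directory and then a nested set of directories. The directory names (texts, project/data/raw) are hypothetical, and the -p option is an addition not mentioned in the lesson: it creates any missing parent directories and does not report an error if the directory already exists.

```shell
# Work in a temporary directory so no real folders are affected
workdir="$(mktemp -d)"
cd "$workdir"

# Create one directory in the present working directory
mkdir texts

# Create nested directories in one step; -p makes the missing parents too
mkdir -p project/data/raw

# Running mkdir -p on an existing directory is harmless
mkdir -p texts

# Show the resulting directory tree
ls -R
```

Without -p, the second command would fail because project and project/data do not exist yet.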
Unzip a file unzip#
The unzip command will unzip a compressed zip file:
unzip filename.zip
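To see unzip in action without needing a download, the sketch below builds a small archive and then extracts it. It assumes both zip and unzip are installed, which is not guaranteed (Git Bash, for example, may not include zip), so the commands are wrapped in a check. The names corpus, sample.txt, and extracted are illustrative. The -l option lists an archive's contents without extracting, and -d chooses the directory to extract into.

```shell
# Work in a temporary directory so no real files are touched
workdir="$(mktemp -d)"
cd "$workdir"

# Only run the demonstration if both tools are installed
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # Build a tiny archive to practice on
    mkdir corpus
    echo "sample text" > corpus/sample.txt
    zip -q -r corpus.zip corpus

    # List the contents of the archive without extracting anything
    unzip -l corpus.zip

    # Extract the archive into a separate directory named extracted
    unzip corpus.zip -d extracted
    ls extracted/corpus
else
    echo "zip or unzip is not installed on this machine; skipping the demonstration"
fi
```

Extracting into a separate directory with -d keeps the unzipped files from mixing with whatever is already in the present working directory.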
Installing Python packages#
If you need a particular package installed for Python, the default package manager is called pip. A second package manager called conda is popular in data science and is often used with the Anaconda distribution of Python that contains a large variety of packages and software suitable for data science work.
For most users, it is suitable to install a package globally, so that it is available to every project on the machine. If you find yourself working on multiple projects at a time with different package requirements, it is a good idea to look into virtual environments. For example, you may work on one project that uses version 0.7 of a package and another that uses version 2.0. Without virtual environments, each time you switched projects, you would need to reinstall the proper version of the package for the project you’re working on. Virtual environments allow you to load a different set of packages depending on which project you’re working on.
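One common way to create a virtual environment is Python's built-in venv module. The lesson does not prescribe a particular tool, so treat the following as a rough sketch under some assumptions: Python 3 is installed and can be started with python3 (on Windows the command is often python), and project-env is a made-up environment name.

```shell
# Work in a temporary directory so nothing is left behind in a real project
workdir="$(mktemp -d)"
cd "$workdir"

# Create a virtual environment in a folder named project-env
python3 -m venv project-env

# Activate it; the activate script lives in a different folder on Windows
if [ -f project-env/bin/activate ]; then
    . project-env/bin/activate        # Mac OS X and Linux
else
    . project-env/Scripts/activate    # Git Bash on Windows
fi

# While the environment is active, python refers to the environment's copy
python -c "import sys; print(sys.prefix)"

# Leave the environment and return to the regular Python installation
deactivate
```

Packages installed while the environment is active go into project-env only, so a second project can have its own environment with a different version of the same package.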
The Python package manager pip#
Installing a package with pip is simple:
pip install package-name
You can uninstall a package with:
pip uninstall package-name
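After installing or uninstalling, it can be useful to confirm which pip you are running and whether a package is present. The sketch below assumes Python 3 is available as python3 (on Windows the command is often python). It inspects pip itself so that nothing needs to be downloaded. The commented-out line shows how a specific version could be requested, with package-name as a placeholder rather than a real package.

```shell
# Show the pip version and the Python installation it belongs to
python3 -m pip --version

# Show details (name, version, location) for an installed package;
# pip itself is used here because it is already installed
python3 -m pip show pip

# To request a specific version, add == and the version number.
# package-name is a placeholder, so this line is left commented out:
# python3 -m pip install "package-name==2.0"
```

Running pip as python3 -m pip makes sure the package is installed for the same Python that the python3 command starts, which matters when more than one Python is installed.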
The Conda package manager conda#
The Anaconda distribution of Python comes with a graphical user interface that allows you to install and uninstall packages (and create virtual environments) with a mouse. However, you can also use the command line and the process is very simple.
To install:
conda install package-name
To uninstall:
conda uninstall package-name