Just as each chapter in this book has a corresponding Jupyter Notebook, each appendix also has a corresponding Jupyter Notebook. All notebooks, regardless of purpose, are maintained in the book's GitHub source code repository. The particular appendix that you are reading here "in print" serves as a special cross-reference to the Jupyter Notebook that provides step-by-step instructions for how to install and configure the book's virtual machine, facilitated using Docker.
You are strongly encouraged to use the book's associated Docker image as a development environment instead of using any existing Python installation you may have on your computer. There are some nontrivial configuration management issues involved in installing the Jupyter Notebook and all of its dependencies for scientific computing. The various other third-party Python packages that are used throughout the book and the need to support users across multiple platforms only exacerbate the complexity that can be involved in getting a basic development environment up and running.
In the somewhat unlikely event that you've stumbled across this notebook outside of its context on GitHub, you can find the full source code repository at https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition.
Stay tuned for a screencast that will walk you through getting started running the source code of the book.
Install the Docker Engine (Community Edition) appropriate to your own operating system or platform
Install a Git client for your operating system
Clone the git repository for Mining the Social Web
git clone git@github.com:mikhailklassen/Mining-the-Social-Web-3rd-Edition.git
Navigate to the directory on your computer where the cloned repository resides. There should be a file called docker-compose.yml in this folder.
Type docker-compose up
docker-compose up will build or download the necessary Docker images to run our applications. This may take a while the first time you run the command, but it should run more quickly in the future. After this finishes, some instructions should appear in the terminal:
mtsw_1 | [C 20:11:25.760 NotebookApp]
mtsw_1 |
mtsw_1 | Copy/paste this URL into your browser when you connect for the first time,
mtsw_1 | to login with a token:
mtsw_1 | http://(9302d900fdc4 or 127.0.0.1):8888/?token=2c14bdc6...
I've truncated the token here. Copy the URL and paste it into your browser, using 127.0.0.1 as the host, e.g.
http://127.0.0.1:8888/?token=2c14bdc6...
This will navigate to the running application on your local computer. The token exists for secure authentication.
You should now see a web application running Jupyter.
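The steps above can be sketched as a small shell script. This is a minimal sketch rather than an official installer: the function names are hypothetical, and the HTTPS clone URL is an assumption derived from the repository address given in this appendix (HTTPS cloning avoids the need for an SSH key registered with GitHub, which the git@github.com form requires).

```shell
#!/bin/sh
# Sketch of the quick-start steps. Function names are illustrative only.

# Succeeds if the named command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Rewrites the logged "http://(<container-id> or 127.0.0.1):8888/..." form
# into a URL that can be pasted straight into a browser.
local_notebook_url() {
    printf '%s\n' "$1" | sed -E 's#\([^ ]+ or 127\.0\.0\.1\)#127.0.0.1#'
}

# Checks prerequisites, clones over HTTPS (assumed URL), starts the containers.
quick_start() {
    for tool in git docker docker-compose; do
        have_tool "$tool" || { echo "missing: $tool" >&2; return 1; }
    done
    git clone https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition.git &&
    cd Mining-the-Social-Web-3rd-Edition &&
    docker-compose up
}

# Uncomment to run the steps for real:
# quick_start
```

Passing the URL line from the terminal output to local_notebook_url yields the 127.0.0.1 form shown in the example above, with the token left untouched.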
"Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. By taking advantage of Docker’s methodologies for shipping, testing, and deploying code quickly, you can significantly reduce the delay between writing code and running it in production." -- From the Docker Overview
We decided to go with Docker because it allowed us to build a consistent Jupyter Notebook experience for Mining the Social Web, no matter what operating system our readers were using. With Docker installed, it only takes a few commands to get up and running without having to worry about installing any of the dependencies.
The following commands must be run from the command line while in the same directory as this repository's docker-compose.yml file.
docker-compose build - Builds the necessary Docker images. It may take some time to complete.
docker-compose ps - Lists containers.
docker-compose up - Builds, (re)creates, starts, and attaches to containers for a service.
docker-compose down - Stops containers and removes containers, networks, volumes, and images created by up.
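A typical session with the commands listed above might look like the following sketch. The compose wrapper and the in_compose_dir helper are hypothetical names introduced here for illustration; they only enforce the requirement that docker-compose.yml be in the current directory. The session itself is left commented out because it starts real containers.

```shell
#!/bin/sh
# Succeeds only when the current directory contains docker-compose.yml.
in_compose_dir() {
    [ -f docker-compose.yml ]
}

# Illustrative wrapper: refuse to run outside the repository directory.
compose() {
    if ! in_compose_dir; then
        echo "docker-compose.yml not found in $(pwd)" >&2
        return 1
    fi
    docker-compose "$@"
}

# A typical session (uncomment to run):
# compose build     # build the images; slow the first time
# compose up -d     # start the containers in the background (detached)
# compose ps        # list the containers and their state
# compose logs      # show container output, including the Jupyter login URL
# compose down      # stop and remove the containers
```

Running up with -d returns control of the terminal, so the login URL has to be read from the logs afterward; running plain up, as in the installation steps, prints it directly.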
In the event that you've never used a version control system such as Git to obtain or manage source code, be assured that it's well worth the investment to learn Git fundamentals. The first two chapters of the free Pro Git book by Scott Chacon and Ben Straub are particularly worth the 15 or so minutes that they take to read, and you'll find that Stack Overflow contains a plethora of answers to common Git questions and best practice guidelines.
The absolute minimum Git skills that you'll want to know for consuming the source code of this book include:
git clone - With Git, you clone a repository to get its source code, and you'll need to clone https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition to get the source code for this repository. (The repository URL is provided in the right margin of https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition if you missed it.)
git status - You can check the status of your repository by typing git status in the source code directory that you cloned. A common reason that you'll use git status is to determine if there are updates in the remote repository that you can pull down. Note that git status compares against the last fetched state of the remote, so run git fetch first for an up-to-date answer.
git pull - Whenever the maintainer of a repository makes an update, you can pull the update by simply typing git pull in the source code directory that you cloned.
git checkout - You can use git checkout to check out a file you may have modified, restoring it to its previous state.
As you become more comfortable with Git, you may want to fork a Git repository, commit changes to it, and push your changes to the master branch on GitHub. Consult https://git-scm.com/ for more information on how to do these things when you are ready to make that additional leap.
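The status and checkout commands described above can be tried safely in a throwaway repository before you use them on the book's source code. This is a minimal sketch: the directory, file name, and commit identity are placeholders chosen for illustration, and nothing here touches the cloned repository.

```shell
#!/bin/sh
# Throwaway repository for practice, created in a temporary directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
echo "original" > notes.txt
git add notes.txt
# Identity is supplied inline so the commit works without global Git config.
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Add notes"

echo "accidental edit" > notes.txt
git status --short           # lists notes.txt as modified
git checkout -- notes.txt    # discard the edit, restoring the committed version
cat notes.txt                # shows the restored content
```

In the cloned repository, the same git checkout -- <file> form restores a notebook you have edited, which is useful if local changes get in the way of a later git pull.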
You are certainly able to download a zip archive of a GitHub repository's source code (look for the "Download ZIP" button in the right margin), but doing so would be a bit ironic. This book is all about the social web, and you'd be avoiding the premier social coding platform that hosts its project code. GitHub is inherently social, and there are benefits to participating that you can't gain any other way besides plugging in, being part of the community, and applying some Git fundamentals to contribute from time to time. Forking code, opening pull requests, and otherwise contributing within the boundaries of the GitHub platform tooling is much easier than you might initially think because GitHub delivers such a tremendous user experience. Take a few extra minutes to check out the source code from GitHub instead of downloading a zip archive. You'll be glad that you took those steps.
Please file tickets here on GitHub if you experience any troubles whatsoever, and thanks again for your interest in Mining the Social Web (3rd Edition). The goal in providing you with a completely turn-key machine experience is so that you can get the most out of the book and its source code -- not to divert your attention into unnecessary system configuration issues. Feedback on ways to improve this experience is always welcome, and pull requests are especially appreciated.
You are free to use or adapt this notebook for any purpose you'd like. However, please respect the Simplified BSD License that governs its use.