How to download a full backup copy of Wikipedia

I’ve spent plenty of time making jokes about what would happen if Wikipedia went offline in our modern, internet-dependent world – planes dropping out of the sky, no knowledge of any events before 2007, dogs walking their owners – but in all seriousness, any Wikipedia outage will affect millions of students, educators, scientists, and everyday people looking for answers to both simple and complex questions.

You’re not totally out of luck though; in this article, I’ll show you how to maintain access to Wikipedia’s information even after the site goes offline. Not only will this be useful during deliberate blackouts (like the January 2012 protest of SOPA and PIPA), but it could also come in handy during network difficulties, power outages, or even new internet legislation.

How to download a backup copy of Wikipedia

Before you get started, please note that the standard English backup of Wikipedia is about 7.5 gigabytes at the time of writing. Even on a fast connection, this database can take several hours to download, depending on the amount of traffic on Wikipedia’s servers. It is safe to assume that Wikipedia’s servers will be hit with record amounts of traffic if a known blackout is approaching, so if you want a copy, start downloading as early as possible.
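
To get a rough feel for how long the download might take on your own connection, here is a minimal Python sketch of the arithmetic. The function name is my own, and it assumes an ideal connection with no server-side throttling, so treat the result as a lower bound rather than a promise.

```python
def estimate_download_hours(size_gb: float, speed_mbps: float) -> float:
    """Return the hours needed to download size_gb gigabytes at speed_mbps megabits per second."""
    # 1 gigabyte = 1000 megabytes, and 1 byte = 8 bits.
    size_megabits = size_gb * 1000 * 8
    seconds = size_megabits / speed_mbps
    return seconds / 3600


if __name__ == "__main__":
    # 7.5 GB is the dump size quoted in this article.
    for speed in (5, 10, 50):
        hours = estimate_download_hours(7.5, speed)
        print(f"{speed} Mbit/s: {hours:.1f} hours")
```

Run as a script, this prints 3.3 hours at 5 Mbit/s, 1.7 hours at 10 Mbit/s, and 0.3 hours at 50 Mbit/s. Heavy traffic on Wikipedia’s servers can easily push the real figure well beyond that.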

First off, don’t worry – it is both legal and free to download a backup of all content available on Wikipedia for personal use, mirroring, informal backups, offline use, or database queries. All text content in Wikipedia is licensed under the Creative Commons Attribution-ShareAlike 3.0 License and the GNU Free Documentation License. Images fall under different terms, but in this guide we’re just going to be downloading the text.

While the downloadable version of Wikipedia’s database is massive, there are a few limitations: Only current revisions of articles will be downloaded, and no discussion or user pages are included.
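
If you would like to check for yourself what the dump contains once you have it (see Step 1), the sketch below counts pages by namespace number without unpacking the whole file. It is only an illustration: the function name and the `limit` parameter are my own, and it assumes the dump is a bzip2-compressed MediaWiki XML export in which each `<page>` element has an `<ns>` child.

```python
import bz2
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Strip the '{namespace-uri}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def count_namespaces(dump_path: str, limit: int = 10000) -> Counter:
    """Count pages per namespace number in the first `limit` pages of a dump."""
    counts = Counter()
    # bz2.open decompresses on the fly, so the multi-gigabyte XML never
    # has to be written to disk or loaded into memory in one piece.
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream):
            if local_name(elem.tag) != "page":
                continue
            ns = next((child.text for child in elem if local_name(child.tag) == "ns"), None)
            counts[ns] += 1
            elem.clear()  # drop the page's children to keep memory use low
            if sum(counts.values()) >= limit:
                break
    return counts
```

Calling `count_namespaces("enwiki-latest-pages-articles.xml.bz2")` (the file name here is an assumption about what you saved) returns a `Counter` keyed by namespace number as a string. In MediaWiki, namespace 0 holds articles, 1 holds talk pages, and 2 holds user pages, so you should see plenty of 0s and no 1s or 2s.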

Step 1

Download the English language Wikipedia dump. You can download the latest version of this file directly from Wikipedia or via BitTorrent (unofficial).

You can also download the Simple English Wikipedia, which is much smaller than the full Wikipedia (about 75 megabytes).
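
If you would rather script the download than click through a browser, here is a minimal Python sketch. The host and file naming pattern (`dumps.wikimedia.org/<wiki>/latest/<wiki>-latest-pages-articles.xml.bz2`) are my assumption about where the current dump lives, so check the dump site before relying on them; the function names are my own as well.

```python
import shutil
import urllib.request

# Assumed location of the public dumps; verify before relying on it.
DUMP_HOST = "https://dumps.wikimedia.org"


def dump_url(wiki: str = "enwiki") -> str:
    """Build the assumed URL of the latest articles dump for a wiki.

    Use "enwiki" for English Wikipedia or "simplewiki" for Simple English.
    """
    return f"{DUMP_HOST}/{wiki}/latest/{wiki}-latest-pages-articles.xml.bz2"


def download_dump(wiki: str, dest_path: str) -> None:
    """Stream the dump to dest_path in 1 MiB chunks instead of loading it into memory."""
    with urllib.request.urlopen(dump_url(wiki)) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, length=1024 * 1024)
```

For a quick trial run, `download_dump("simplewiki", "simplewiki.xml.bz2")` fetches the much smaller Simple English dump. This sketch does not resume interrupted transfers, so for the full English dump a download manager or the BitTorrent option may be the more forgiving choice.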

Step 2

The Wikipedia database dump is not very useful on its own, so next you’ll need to download the free application WikiTaxi (Windows only) to view Wikipedia on your computer.

(Mac users can check out Wiki Offline for about $10, but in this guide I will only be covering WikiTaxi for Windows.)

WikiTaxi is a “portable” application so you don’t have to install anything. All you need to do is extract the downloaded .zip file and you’re finished.

Step 3

Once you have extracted WikiTaxi and your Wikipedia database download has finished, open the WikiTaxi Importer (WikiTaxi_Importer.exe). Browse to the location of the Wikipedia database you downloaded in Step 1, and then select a location to save the new WikiTaxi-formatted database file. Click Import Now! when you’re ready.

Step 4

Close the WikiTaxi Importer and open the main WikiTaxi application (WikiTaxi.exe). Click the Options button and select Open a *.taxi Database. Locate the database you created in Step 3 and select Open.

That’s it! You now have full, offline access to Wikipedia.

 



Comments

4 responses to “How to download a full backup copy of Wikipedia”

  1. Grouf Pignon

    Step 1 – Speak something else than English
    Step 2 – Use wikipedia even if it’s gone dark in english.
    Better.

  2. […] The other day I came across a step-by-step guide on how to download a backup copy of Wikipedia. Cool, I guess- if you have the space and time for […]

  3. […] other day I came across a step-by-step guide on how to download a backup copy of Wikipedia. Cool, I guess- if you have the space and time for […]

  4. Darnassus

    There’s two different types:
    enwiki-20170401-pages-meta-current.xml.bz2 (25.1 GiB)
    enwiki-20170401-pages-articles.xml.bz2 (12.9 GiB)

    Which one contains what? Is the Meta Current one with pictures and so on? What’s the difference?
