Backing up a MediaWiki wiki
Friday, February 22nd, 2008
I have been filling my wiki with lots of things. Since the data is quite important and I cannot afford to lose it, I must arrange some kind of backup scheme for it. I Googled a lot and found many scattered pieces, which I now try to put together into what I think is a complete, fairly general and simple solution to the problem.
For this I assume the following:
- ftp.server.of.wiki : the address of the ftp server where your wiki is stored
- ftp_username : the username to access ftp.server.of.wiki
- ftp_username_password : the password for ftp_username
- remote_path_to_wiki : the path, from the root of your ftp server (ftp.server.of.wiki), to your wiki
- local_path_to_wiki_database_backup : the path to the local directory where you want to store your wiki database backup
- local_path_to_wiki_dir_backup : the path to the local directory where you want to store your wiki directory backup
- wiki_database_name : the name of your wiki database
- wiki_database_username : the username you configured to access the wiki database
- database_user_password : the password for the user wiki_database_username
- wiki.database.server : the address of the database server of your wiki database
- database_backup_filename : the name of the file where you want to save the backup of your database
- you are using mysql
Let us do this in steps:
- database backup
- wiki files backup
- automate backup
1- Database backup
To back up the database one just needs the mysqldump command. It is available on both Windows and Linux, so you can do this on either system. Simply type:
mysqldump -uwiki_database_username -hwiki.database.server -pdatabase_user_password wiki_database_name > database_backup_filename.sql
If this ran correctly, you should now have a file with all the SQL commands needed to restore your database.
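The same dump can also be started from Python, which avoids the shell redirection and makes it easy to check the result afterwards. The following is only a minimal sketch under some assumptions: the helper names build_mysqldump_command, run_dump and dump_looks_complete are mine and not part of any library, and the completeness check relies on mysqldump's default behaviour of ending the file with a "-- Dump completed" comment, which will be missing if comments are switched off.

```python
import subprocess


def build_mysqldump_command(username, server, password, database):
    """Return the mysqldump call as an argument list (no shell involved)."""
    return [
        "mysqldump",
        "--user=%s" % username,
        "--host=%s" % server,
        "--password=%s" % password,
        database,
    ]


def run_dump(command, backup_filename):
    """Run the command and write its standard output to backup_filename.

    Returns True when mysqldump exits with status 0.
    """
    with open(backup_filename, "wb") as output:
        return subprocess.call(command, stdout=output) == 0


def dump_looks_complete(backup_filename):
    """Heuristic check: look for 'Dump completed' near the end of the file."""
    with open(backup_filename, "rb") as dump:
        dump.seek(0, 2)                  # jump to the end of the file
        size = dump.tell()
        dump.seek(max(0, size - 200))    # read only the last 200 bytes
        return b"Dump completed" in dump.read()
```

With the placeholder values from the list of assumptions, run_dump(build_mysqldump_command("wiki_database_username", "wiki.database.server", "database_user_password", "wiki_database_name"), "database_backup_filename.sql") would write the dump, and dump_looks_complete("database_backup_filename.sql") would then tell you whether mysqldump seems to have reached the end.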
2- Wiki files backup
To back up your wiki files you have many options. One is rsync, but that is not always possible: my host, for instance, did not allow me to use it. So my option was ftp. I searched the internet and found lftp, which is better than plain ftp because it really makes sure that the transfer took place, and it offers the mirror command. For this we just need to type:
lftp -u "ftp_username,ftp_username_password" -e "mirror remote_path_to_wiki local_path_to_wiki_dir_backup" ftp.server.of.wiki
This command copies everything. Instead, you could use the following command, which copies only the new files:
lftp -u "ftp_username,ftp_username_password" -e "mirror --only-newer remote_path_to_wiki local_path_to_wiki_dir_backup" ftp.server.of.wiki
I also had some problems with empty directories with v.3.0.6 of lftp. After I installed the latest version at the time, v.3.6.3, everything was OK, so make sure you have the latest version.
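If you prefer to start the mirror from a script, the call can be built as an argument list, which removes one layer of shell quoting (paths containing spaces would still have to be quoted inside the mirror command itself). This is only a sketch: build_lftp_mirror_command and mirror_wiki are names I made up, and the trailing exit is my own addition so that lftp leaves its prompt once the mirror finishes, since with -e alone it stays open.

```python
import subprocess


def build_lftp_mirror_command(username, password, remote_dir, local_dir,
                              server, only_newer=True):
    """Return the lftp mirror call as an argument list."""
    mirror = "mirror"
    if only_newer:
        mirror += " --only-newer"
    # Assumption, not in the commands above: 'exit' ends the lftp session
    # after the mirror is done instead of leaving the prompt open.
    script = "%s %s %s; exit" % (mirror, remote_dir, local_dir)
    return ["lftp", "-u", "%s,%s" % (username, password), "-e", script, server]


def mirror_wiki(command):
    """Run lftp and return True when it exits with status 0."""
    return subprocess.call(command) == 0
```

Calling mirror_wiki(build_lftp_mirror_command("ftp_username", "ftp_username_password", "remote_path_to_wiki", "local_path_to_wiki_dir_backup", "ftp.server.of.wiki")) would then run the incremental mirror shown above.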
3- Automate backup
Doing this every time you want a backup is OK, but if you are like me, you will forget to do it regularly and end up waking in the night thinking: “F#$%k! I haven’t done a backup for 3 weeks…”. So I decided that I must automate the backup process. For this I use cron (through the crontab command), which is driven by a file listing the cron jobs, that is, the commands we want to execute regularly. But first we need a single command that performs the whole backup.
I am a Python fan, so I made a Python script that automates both steps I have just described above. I will not explain how it works; just trust that when you run the script below, say it is called wiki_backup.py, by typing python wiki_backup.py, it performs the previous operations. So you just need to copy and paste this code into a file and name it wiki_backup.py:
#! /usr/bin/python
###############################################################################
#
# Script that makes the backup of a mediawiki wiki located
# at a remote host. To do it, it makes a backup of:
#    1- database
#    2- wiki directory
#
# INPUTS:
#    database_username: username of the wiki database
#    database_name: name of the wiki database
#    database_username_password: password for the user database_username
#    database_server: the address of the database server that holds the wiki database
#    ftp_server: the address of the ftp server where the wiki is stored
#    ftp_username: the username to access the ftp server where the wiki is stored
#    ftp_password: the password for ftp_username
#    wiki_remote_dir: the path, from the root of the ftp server, to the wiki
#    backup_directory_name: the name of the directory where you want to backup
#                           your wiki. This directory is created in your home dir
#    wiki_name: the name of your wiki, used to create some of the backup files.
#
# RESULTS:
#    Every time the script is run, a file named:
#        wiki_name_YYYYMMDD.sql
#    is created, with YYYY the year, MM the month and DD the day.
#    This file will be stored in the subdirectory under backup_directory_name
#    named database_backups.
#
#    Also, all the content of the remote dir where your wiki is stored is copied
#    inside backup_directory_name, in a folder named wiki_files_backup.
#    This backup is incremental.
#
###############################################################################

###############################################################################
#
# User input data
#
database_username = "insert_here"
database_name = "insert_here"
database_username_password = "insert_here"
database_server = "insert_here"
ftp_server = "insert_here"
ftp_username = "insert_here"
ftp_password = "insert_here"
wiki_remote_dir = "insert_here"
backup_directory_name = "insert_here"
wiki_name = "insert_here"
#
###############################################################################

# import the necessary modules

# module for running operating system commands
import os
# module to have time operations, like getting the time
import time

# create the directory structure where to store all the backup files:
#
# ---- ~/
#      |
#      |__backup_directory_name
#         |
#         |__database_backups
#         |__wiki_files_backup
#

# get the home directory.
try:
    # if in WINDOWS, USERPROFILE exists
    home_dir = os.environ["USERPROFILE"]
except KeyError:
    # if an error occurs, trap the error and use
    # HOME instead
    home_dir = os.environ["HOME"]

# make the root dir for the backups
root_dir_backup = os.path.join(home_dir, backup_directory_name)

# now generate the subdirectories names
database_backups_dir = os.path.join(root_dir_backup, "database_backups")
wiki_files_backup_dir = os.path.join(root_dir_backup, "wiki_files_backup")

# check if the directories exist and create them if not
for directory in (root_dir_backup, database_backups_dir, wiki_files_backup_dir):
    if not os.path.isdir(directory):
        # a missing path is created; if the path exists but is a file,
        # os.mkdir raises an error and the script stops here
        os.mkdir(directory)

# now that the directory structure is created, generate the
# filename for the database backup, using the current time and the
# wiki name: wiki_name_YYYYMMDD.sql

# get the local time
localtime = time.localtime()
database_backup_filename = "%s_%04d%02d%02d.sql" % (wiki_name, localtime[0], localtime[1], localtime[2])
database_backup_filename_complete_path = os.path.join(database_backups_dir, database_backup_filename)

# give info to the user
print("making backup of wiki database:")
print("database: %s" % database_name)
print("server: %s" % database_server)
print("username: %s" % database_username)

# backup the database from the remote server
command = "mysqldump -u%s -h%s -p%s %s > %s" % (database_username, database_server, database_username_password, database_name, database_backup_filename_complete_path)
os.system(command)

# give info to the user
print("")
print("making backup of wiki files:")
print("ftp server: %s" % ftp_server)
print("remote dir: %s" % wiki_remote_dir)
print("username: %s" % ftp_username)

# backup the wiki directory from the remote dir
command = "lftp -u \"%s,%s\" -e \"mirror --only-newer %s %s\" %s" % (ftp_username, ftp_password, wiki_remote_dir, wiki_files_backup_dir, ftp_server)
os.system(command)
Afterwards, make it executable by running chmod +x wiki_backup.py in the directory where the file is.
Let us first recall how a cron jobs file is made. Each line of a cron jobs file is a job you want to perform regularly, and in that line you specify the schedule and the command to execute, like this:
[min] [hour] [day of month] [month] [day of week] [program to be run]
where:
- [min]: the minutes at which the program should run. 0-59. Do not set as * or the program will be run once a minute.
- [hour]: the hour at which the program should run. 0-23, * for every hour
- [day of month]: the day of the month at which the program should run. 1-31, * for every day.
- [month]: the month at which the program should run. 1-12, * for every month.
- [day of week]: the day of the week at which the program should run. 0-6 where Sunday=0, Monday=1, …, Saturday=6 and * for every day of the week.
- [program to be run]: the program to be executed. Include full path information.
An example:
0,15,30,45 * * * * /usr/bin/foo
This runs the program /usr/bin/foo every 15 minutes, on every hour, day of the month, month and day of the week. It will run every 15 minutes for as long as the machine is running.
OK, you get the point. For more than one cron job, just put one job on each line of the cron jobs file.
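To make the field layout concrete, here is a small Python sketch that splits a cron line into its five schedule fields and the command, and expands each field into the values it matches. It is a simplification written for illustration only: parse_cron_line and expand_field are hypothetical helpers, and only *, single numbers and comma-separated lists are handled (real cron accepts more, such as ranges).

```python
# Name and allowed range of each schedule field, in cron order.
FIELD_RANGES = [
    ("min", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 6),
]


def expand_field(field, low, high):
    """Return the sorted list of values a schedule field matches."""
    if field == "*":
        return list(range(low, high + 1))
    values = sorted(set(int(part) for part in field.split(",")))
    for value in values:
        if not low <= value <= high:
            raise ValueError("%d is outside %d-%d" % (value, low, high))
    return values


def parse_cron_line(line):
    """Split a cron line into a dict of expanded fields plus the command."""
    parts = line.split(None, 5)   # five schedule fields, then the command
    if len(parts) != 6:
        raise ValueError("expected 5 schedule fields and a command")
    job = {"command": parts[5].strip()}
    for (name, low, high), field in zip(FIELD_RANGES, parts):
        job[name] = expand_field(field, low, high)
    return job


job = parse_cron_line("0,15,30,45 * * * * /usr/bin/foo")
print(job["min"])       # [0, 15, 30, 45]
print(job["command"])   # /usr/bin/foo
```

The hour, day and month entries of job hold every value in their range, which is exactly what the * in those positions means.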
Let us continue with our mission.
First create the text file with the cron jobs, say cron.wiki. You can do it with any text editor (gedit, kate, emacs, vim, pico, whatever). If you want the backup procedure to run at 03:00 in the morning, every day of the month, every month and every day of the week, this file should contain:
0 3 * * * /absolute/path/to/python/script/wiki_backup.py
Now, in a shell, install the cron jobs using crontab (note that this replaces any crontab you already have for your user, so copy your existing jobs into cron.wiki too):
crontab cron.wiki
Check that it was added correctly:
crontab -l
If your job appears, then everything is OK and your wiki will be backed up every day at 03:00. Now you can relax!
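Since the job now runs unattended, it can be worth checking now and then that the dump for a given day really exists. The sketch below assumes the directory layout and the wiki_name_YYYYMMDD.sql naming used by wiki_backup.py; expected_dump_path and backup_exists are illustrative names of my own.

```python
import os
import time


def expected_dump_path(root_dir_backup, wiki_name, when=None):
    """Return the path where the dump of the given day should be stored."""
    if when is None:
        when = time.localtime()   # default: today
    filename = "%s_%04d%02d%02d.sql" % (wiki_name, when[0], when[1], when[2])
    return os.path.join(root_dir_backup, "database_backups", filename)


def backup_exists(root_dir_backup, wiki_name, when=None):
    """True when the dump of the given day exists and is not empty."""
    path = expected_dump_path(root_dir_backup, wiki_name, when)
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

For example, backup_exists(os.path.join(os.environ["HOME"], "backup_directory_name"), "wiki_name") returns True only if today's dump is present and non-empty; both strings are placeholders for the values you set in the script.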
In the near future I will show how to restore a wiki, from the files we are backing up.