
Backing up a mediawiki wiki

Friday, February 22nd, 2008

I have been filling my wiki with lots of things. Since the data is quite important and I cannot afford to lose it, I must arrange some kind of backup scheme for it. I Googled a lot and found many scattered pieces that I now try to put together into what I think is a complete, quite general and simple solution to the problem.

For this I assume the following:

  1. ftp.server.of.wiki : the URL of the ftp server where your wiki is stored
  2. ftp_username : the username to access ftp.server.of.wiki
  3. ftp_username_password : the password for ftp_username
  4. remote_path_to_wiki : the path, from the root of your ftp server (the ftp.server.of.wiki), to your wiki
  5. local_path_to_wiki_database_backup : the path to the local directory where you want to store your wiki database backup
  6. local_path_to_wiki_dir_backup : the path to the local directory where you want to store your wiki directory backup
  7. wiki_database_name : the name of your wiki database
  8. wiki_database_username : the username you configured to access the wiki database
  9. database_user_password : the password for the user wiki_database_username
  10. wiki.database.server : the URL of the database server of your wiki database
  11. database_backup_filename : the name of the file where you want to save the backup of your database
  12. you are using mysql

Let us do this in three steps:

  1. database backup
  2. wiki files backup
  3. automate backup

1- Database backup

To back up the database you just need the mysqldump command. It is available on both Windows and Linux, so this works on either system. Simply type:

mysqldump -uwiki_database_username -hwiki.database.server -pdatabase_user_password wiki_database_name > database_backup_filename.sql

If this ran correctly, you should now have a file with all the SQL commands needed to restore your database.
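For readers who prefer to drive mysqldump from Python, here is a minimal sketch of the same command. The function names build_mysqldump_args and dump_database are hypothetical helpers made up for this illustration, and the sketch assumes mysqldump is on your PATH. As in the command above, the password is passed on the command line.

```python
import subprocess


def build_mysqldump_args(username, server, password, database):
    """Return the mysqldump argument list equivalent to the command above."""
    return [
        "mysqldump",
        "-u" + username,
        "-h" + server,
        "-p" + password,
        database,
    ]


def dump_database(username, server, password, database, backup_filename):
    """Run mysqldump and write its output to backup_filename.

    Returns the exit code of mysqldump (0 means success).
    """
    args = build_mysqldump_args(username, server, password, database)
    # Writing stdout to the file plays the role of the shell's ">" redirection.
    with open(backup_filename, "wb") as backup_file:
        return subprocess.call(args, stdout=backup_file)
```

Passing the arguments as a list means no shell is involved, so special characters in the password do not need quoting.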

2- Wiki files backup

To back up your wiki files you have many options. One is to use rsync, but that is not always possible: my host, for example, did not allow me to use it. So my option was ftp. I searched the internet and found lftp, which is better than ftp because it really assures you that the transfer occurred and it offers the mirror command. For this we just need to type:

lftp -u "ftp_username,ftp_username_password" -e "mirror remote_path_to_wiki local_path_to_wiki_dir_backup" ftp.server.of.wiki

This command will copy everything. You could instead use the following command, which copies only the new files:

lftp -u "ftp_username,ftp_username_password" -e "mirror --only-newer remote_path_to_wiki local_path_to_wiki_dir_backup" ftp.server.of.wiki

I also had some problems with empty directories with v.3.0.6 of lftp. I then installed the latest version, v.3.6.3, and everything was OK, so make sure you have the latest version.
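The two lftp commands above differ only in the --only-newer option, which a small Python helper can make explicit. This is only a sketch: build_lftp_mirror_args is a hypothetical name, and it assumes that neither path contains spaces, since the mirror command is assembled as a single string.

```python
def build_lftp_mirror_args(ftp_server, username, password,
                           remote_path, local_path, only_newer=True):
    """Return the lftp argument list for mirroring remote_path into local_path."""
    mirror = "mirror"
    if only_newer:
        # copy only the files that are newer than the local copies
        mirror += " --only-newer"
    script = "%s %s %s" % (mirror, remote_path, local_path)
    return ["lftp", "-u", "%s,%s" % (username, password), "-e", script, ftp_server]
```

The resulting list can be handed to subprocess.call, which runs lftp without going through a shell, so the quoting around the username and password is no longer needed.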

3- Automate backup

Doing this every time you want to back up is OK, but if you are like me, you will forget to do it regularly and end up waking in the night thinking: “F#$%k! I haven’t done a backup for 3 weeks…”. So I decided that I must automate the backup process. I use cron (or crontab). Let us first set up a file that defines the cron jobs, that is, the commands we want to execute regularly.

I am a python fan so I made a python script that automates both processes I have just described. I will not explain how it works; just trust that when you run the script shown next (say it is called wiki_backup.py) by typing python wiki_backup.py, it performs the previous operations. So you just need to copy and paste this code into a file, name it wiki_backup.py and put it wherever you want.

#! /usr/bin/python
###############################################################################
#
#   Script that makes the backup of a mediawiki wiki located
#   at a remote host. To do it, it makes a backup of:
#      1- database
#      2- wiki directory
#
#   INPUTS:
#       database_username: username used to access the wiki database
#       database_name: name of the wiki database
#       database_username_password: password for the user database_username
#       database_server: the url of the database server that holds the wiki database
#       ftp_server: the url of the ftp server where the wiki is stored
#       ftp_username: the username to access the ftp server where the wiki is stored
#       ftp_password: the password for the ftp_username
#       wiki_remote_dir: the path, from the root of the ftp server, to your wiki
#       backup_directory_name: the name of the directory where you want to backup
#                              your wiki. This directory is created in your home dir
#       wiki_name: the name of your wiki, used to create some of the backup files.
#
#   RESULTS:
#      Every time the script is run, a file named:
#         wiki_name_YYYYMMDD.sql
#      is created, with YYYY the year, MM the month and DD the day.
#      This file will be stored in the subdirectory under backup_directory_name
#      named database_backups.
#
#      Also, all the content of the remote dir where your wiki is located is
#      stored inside backup_directory_name in a folder named wiki_files_backup.
#      This backup is incremental.
#
###############################################################################
 
###############################################################################
#
# User input data
#
database_username = "insert_here"
database_name = "insert_here"
database_username_password = "insert_here"
database_server = "insert_here"
ftp_server = "insert_here"
ftp_username = "insert_here"
ftp_password = "insert_here"
wiki_remote_dir = "insert_here"
backup_directory_name = "insert_here"
wiki_name = "insert_here"
#
###############################################################################
 
# import the necessary modules
 
# module for running operating system commands
import os
# module to have time operations like, getting the time
import time
 
# create the directory structure where to store all the backup files:
#
# ---- ~/
#         |
#         |__backup_directory_name
#                 |
#                 |__database_backups
#                 |__wiki_files_backup
#
 
# check if backup_directory_name exists, if not, create it
 
# get the home directory.
try:
	#If in WINDOWS USERPROFILE exists
	home_dir = os.environ["USERPROFILE"]
except KeyError:
	# if an error occurs, trap the error and use
	# HOME instead
	home_dir = os.environ["HOME"]
 
# make the root dir for the backups
root_dir_backup = os.path.join(home_dir, backup_directory_name)
 
# check if root_dir_backup exists
if not os.path.exists(root_dir_backup):
	# if it does not exist create it
	os.mkdir(root_dir_backup)
# if it exists, it must be a directory
elif not os.path.isdir(root_dir_backup):
	# something else is in the way: os.mkdir would fail, so stop with a clear message
	raise SystemExit("%s exists but is not a directory" % root_dir_backup)
 
# now generate the subdirectories names
database_backups_dir = os.path.join(root_dir_backup, "database_backups")
wiki_files_backup_dir = os.path.join(root_dir_backup, "wiki_files_backup")
 
# check if the directories exist and create them if not, as before
# database_backups_dir
if not os.path.exists(database_backups_dir):
	# if it does not exist create it
	os.mkdir(database_backups_dir)
# if it exists, it must be a directory
elif not os.path.isdir(database_backups_dir):
	# something else is in the way: os.mkdir would fail, so stop with a clear message
	raise SystemExit("%s exists but is not a directory" % database_backups_dir)
# wiki_files_backup_dir
if not os.path.exists(wiki_files_backup_dir):
	# if it does not exist create it
	os.mkdir(wiki_files_backup_dir)
# if it exists, it must be a directory
elif not os.path.isdir(wiki_files_backup_dir):
	# something else is in the way: os.mkdir would fail, so stop with a clear message
	raise SystemExit("%s exists but is not a directory" % wiki_files_backup_dir)
 
# now that the directory structure is created, generate the
# filename for the database backup, using the current time and the
# wiki name: wiki_name_YYYYMMDD.sql
 
# get the local time
localtime = time.localtime()
database_backup_filename = "%s_%04d%02d%02d.sql" % (wiki_name, localtime[0], localtime[1], localtime[2])
database_backup_filename_complete_path = os.path.join(database_backups_dir, database_backup_filename)
 
# give info to the user
# (print is called with parentheses so the script runs on Python 2 and Python 3)
print("making backup of wiki database:")
print("database: %s" % database_name)
print("server: %s" % database_server)
print("username: %s" % database_username)
 
# backup the database from the remote server
command = "mysqldump -u%s -h%s -p%s %s > %s" % (database_username, database_server, database_username_password, database_name, database_backup_filename_complete_path)
os.system(command)
 
# give info to the user
print("")
print("making backup of wiki files:")
print("ftp server: %s" % ftp_server)
print("remote dir: %s" % wiki_remote_dir)
print("username: %s" % ftp_username)
 
# backup the wiki directory from the remote dir
command = "lftp -u \"%s,%s\" -e \"mirror --only-newer %s %s\" %s" % (ftp_username, ftp_password, wiki_remote_dir, wiki_files_backup_dir, ftp_server)
os.system(command)

Afterwards, make the script executable by running chmod +x wiki_backup.py in the directory where the file is.
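Two pieces of the script above, the dated filename and the directory checks, can also be written as small reusable functions. The following is just a sketch of one possible way to do it; backup_filename and ensure_directory are hypothetical names that do not appear in the script.

```python
import os
import time


def backup_filename(wiki_name, when=None):
    """Return wiki_name_YYYYMMDD.sql for the given time tuple (default: now)."""
    if when is None:
        when = time.localtime()
    return "%s_%s.sql" % (wiki_name, time.strftime("%Y%m%d", when))


def ensure_directory(path):
    """Create path if missing; fail clearly if it exists but is not a directory."""
    if os.path.isdir(path):
        return
    if os.path.exists(path):
        raise OSError("%s exists and is not a directory" % path)
    os.makedirs(path)
```

For example, backup_filename("mywiki", time.strptime("2008-02-22", "%Y-%m-%d")) gives "mywiki_20080222.sql".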

Let us first recall how a cron jobs file is made. Each line of a cron jobs file is a job you want to perform regularly; in that line you specify the schedule and the command to execute, like this:

[min] [hour] [day of month] [month] [day of week] [program to be run]

where:

  • [min]: the minutes at which the program should run. 0-59. Do not set as * or the program will be run once a minute.
  • [hour]: the hour at which the program should run. 0-23, * for every hour
  • [day of month]: the day of the month at which the program should run. 1-31, * for every day.
  • [month]: the month at which the program should run. 1-12, * for every month.
  • [day of week]: the day of the week at which the program should run. 0-6 where Sunday=0, Monday=1, …, Saturday=6 and * for every day of the week.
  • [program]: the program to be executed. Include full path information.

An example:

0,15,30,45 * * * * /usr/bin/foo

This runs the program /usr/bin/foo every 15 minutes, on every hour, day of the month, month and day of the week, for as long as the machine is running. OK, you get the point. For more than one cron job, just put one job on each line of the cron jobs file.
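To get a feeling for how the five time fields select the moments at which a job runs, here is a simplified Python sketch that checks a cron line against a given time. It is only an illustration: parse_cron_field and cron_line_matches are hypothetical names, and the sketch handles only *, single numbers and comma-separated lists. Real cron also accepts ranges and steps, and treats the two day fields specially when both are restricted.

```python
def parse_cron_field(field, lowest, highest):
    """Expand one cron field ("*", "5" or "0,15,30,45") into a set of ints."""
    if field == "*":
        return set(range(lowest, highest + 1))
    return set(int(value) for value in field.split(","))


def cron_line_matches(line, minute, hour, day, month, weekday):
    """Return True if the cron line would fire at the given time.

    weekday follows cron's convention: Sunday=0 ... Saturday=6.
    """
    # the first five whitespace-separated fields are the schedule,
    # the rest of the line is the program to run
    fields = line.split(None, 5)
    ranges = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]
    now = [minute, hour, day, month, weekday]
    for field, (lowest, highest), value in zip(fields[:5], ranges, now):
        if value not in parse_cron_field(field, lowest, highest):
            return False
    return True
```

With the example line above, cron_line_matches("0,15,30,45 * * * * /usr/bin/foo", 15, 10, 1, 1, 0) is True, while the same call with minute 16 is False.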

Let us continue with our mission.

First create the text file with the cron jobs, say cron.wiki. You can do it with any text editor (gedit, kate, emacs, vim, pico, whatever). If you want the backup procedure to run at 03:00 in the morning, every day of the month, every month and every day of the week, this file should contain:

0 3 * * * /absolute/path/to/python/script/wiki_backup.py

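Now, in a shell, add the cron jobs using crontab. Note that this replaces any cron jobs you had previously installed with crontab, so include those in cron.wiki as well: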

crontab cron.wiki

Check if it was added correctly:

crontab -l

If your job appears, then everything is OK and your wiki will be backed up every day at 03:00. Now you can relax!

In the near future I will show how to restore a wiki, from the files we are backing up.
