Interview Questions And Answers Set - 6
When and how much can
MySQL replication improve the performance of my system?
MySQL replication is most beneficial for a system with frequent reads and not so
frequent writes. In theory, by using a one master/many slaves setup you can
scale by adding more slaves until you either run out of network bandwidth, or
your update load grows to the point that the master cannot handle it.
In order to determine how many slaves you can get before the added benefits
begin to level out, and how much you can improve performance of your site, you
need to know your query patterns, and empirically (by benchmarking) determine
the relationship between read throughput (reads per second, or max_reads) and
write throughput (writes per second, or max_writes) on a typical master and a
typical slave. The example
below will show you a rather simplified calculation of what you can get with
replication for our imagined system.
Let's say our system load consists of 10% writes and 90% reads, and we have
determined that max_reads = 1200 - 2 * max_writes, or in other words, our system
can do 1200 reads per second with no writes, an average write takes twice as
long as an average read, and the relationship is linear. Let us suppose that
our master
and slave are of the same capacity, and we have N slaves and 1 master. Then we
have for each server (master or slave):
reads = 1200 - 2 * writes (from benchmarks)
reads = 9 * writes / (N + 1) (reads are split across servers, but writes go to all servers)
9 * writes/(N + 1) + 2 * writes = 1200
writes = 1200/(2 + 9/(N + 1))
So if N = 0, which means we have no replication, our system can handle 1200/11,
about 109 writes per second (which means we will have 9 times as many reads due
to the nature of our application).
If N = 1, we can get up to 184 writes per second.
If N = 8, we get up to 400.
If N = 17, 480 writes.
Eventually as N approaches infinity (and our budget negative infinity), we can
get very close to 600 writes per second, increasing system throughput about 5.5
times. However, with only 8 servers, we increased it almost 4 times already.
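The arithmetic above can be sketched as a short calculation. This is a hypothetical helper using the benchmark numbers assumed in this example (1200 reads/second peak, writes twice as expensive as reads, 9 reads per write):

```python
def max_writes(n_slaves, read_capacity=1200, write_cost=2, read_write_ratio=9):
    """Maximum writes/second for 1 master and n_slaves slaves, assuming
    reads = read_capacity - write_cost * writes per server, and
    read_write_ratio reads arriving for every write."""
    return read_capacity / (write_cost + read_write_ratio / (n_slaves + 1))

# With no slaves the system handles about 109 writes/second; each added
# slave raises the ceiling, approaching 600 as N grows.
for n in (0, 1, 8, 17):
    print(n, int(max_writes(n)))
```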
Note that our computations assumed infinite network bandwidth, and neglected
several other factors that could turn out to be significant on your system. In
many cases, you may not be able to make a computation similar to the one above
that will accurately predict what will happen on your system if you add N
replication slaves. However, answering the following questions should help you
decide whether, and by how much, replication will improve the performance of
your system:
What is the read/write ratio on your system?
How much more write load can one server handle if you reduce the reads?
How many slaves do you have bandwidth for on your network?
How can I use replication
to provide redundancy/high availability?
With the currently available features, you would have to set up a master and a
slave (or several slaves), and write a script that will monitor the master to
see if it is up, and notify your applications and the slaves of the master
change in case of failure. Some suggestions:
To tell a slave to change its master, use the CHANGE MASTER TO command.
A good way to keep your applications informed of where the master is, is to
have a dynamic DNS entry for the master. With BIND you can use nsupdate to
dynamically update your DNS.
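As a sketch, a dynamic DNS update with nsupdate might look like the following. The server name, record name, and address are placeholders; this is not runnable without a DNS server configured to accept the update:

```
nsupdate << EOF
server ns1.example.com
update delete master.db.example.com A
update add master.db.example.com 60 A 192.168.0.12
send
EOF
```

The short TTL (60 seconds here) keeps the time during which clients see the old master's address small after a failover.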
You should run your slaves with the log-bin option and without
log-slave-updates. This way the slave will be ready to become a master as soon
as you issue STOP SLAVE; RESET MASTER, and CHANGE MASTER TO on the other slaves.
It will also help you catch spurious updates that may happen because of
misconfiguration of the slave (ideally, you want to configure access rights so
that no client can update the slave, except for the slave thread) combined with
the bugs in your client programs (they should never update the slave directly).
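The promotion sequence described above, sketched as SQL statements. The hostname and credentials are placeholders for illustration:

```sql
-- On the slave being promoted to master:
STOP SLAVE;
RESET MASTER;

-- On each remaining slave, point replication at the new master:
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='new-master.example.com',
                 MASTER_USER='repl', MASTER_PASSWORD='slavepass';
START SLAVE;
```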
We are currently working on integrating an automatic master election system
into MySQL, but until it is ready, you will have to create your own monitoring
solution.
How MySQL Opens and
Closes Tables?
table_cache, max_connections, and max_tmp_tables affect the maximum
number of files the server keeps open. If you increase any of these
values, you may run up against a limit imposed by your operating system on the
per-process number of open file descriptors. However, you can increase the limit
on many systems. Consult your OS documentation to find out how to do this,
because the method for changing the limit varies widely from system to system.
table_cache is related to max_connections. For example, for 200 concurrent
running connections, you should have a table cache of at least 200 * n, where n
is the maximum number of tables in a join.
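For example, if your application runs up to 200 concurrent connections and joins at most 4 tables per query, a my.cnf fragment along these lines would follow the rule of thumb above (the numbers are illustrative):

```
[mysqld]
max_connections = 200
# 200 connections * 4 tables per join
table_cache = 800
```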
The cache of open tables can grow to a maximum of table_cache (default 64; this
can be changed with the -O table_cache=# option to mysqld). A table is never
closed, except when the cache is full and another thread tries to open a table
or if you use mysqladmin refresh or mysqladmin flush-tables.
When the table cache fills up, the server uses the following procedure to locate
a cache entry to use:
Tables that are not currently in use are released, in least-recently-used order.
If the cache is full and no tables can be released, but a new table needs to be
opened, the cache is temporarily extended as necessary.
If the cache is in a temporarily-extended state and a table goes from in-use to
not-in-use state, the table is closed and released from the cache.
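The eviction procedure above can be sketched roughly in Python. This is a simplified model of the behavior described, not MySQL's actual implementation:

```python
from collections import OrderedDict

class TableCache:
    """Simplified model of the table cache: LRU eviction of unused
    tables, with temporary extension when every cached table is in use."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tables = OrderedDict()  # table name -> in_use flag, LRU order

    def open_table(self, name):
        if name in self.tables:
            self.tables.move_to_end(name)        # mark most recently used
        elif len(self.tables) >= self.capacity:
            # Release unused tables in least-recently-used order.
            for victim, in_use in list(self.tables.items()):
                if not in_use:
                    del self.tables[victim]
                    break
            # If none could be released, the cache simply grows past
            # capacity (the "temporarily extended" state).
        self.tables[name] = True                 # table is now in use

    def release_table(self, name):
        self.tables[name] = False
        # In the temporarily-extended state, close released tables at once.
        if len(self.tables) > self.capacity:
            del self.tables[name]
```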
A table is opened for each concurrent access. This means that if you have two
threads accessing the same table or access the table twice in the same query
(with AS) the table needs to be opened twice. The first open of any table takes
two file descriptors; each additional use of the table takes only one file
descriptor. The extra descriptor for the first open is used for the index file;
this descriptor is shared among all threads.
You can check whether your table cache is too small by checking the mysqld
status variable Opened_tables. If this value is quite large, even though you
haven't issued many FLUSH TABLES statements, you should increase your table
cache.
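A quick way to inspect these values from the mysql client (variable names as used in the MySQL versions this text describes):

```sql
SHOW STATUS LIKE 'Opened_tables';
SHOW VARIABLES LIKE 'table_cache';
```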
MySQL - Drawbacks to Creating
Large Numbers of Tables in the Same Database
If you have many files in a directory, open, close, and create operations will
be slow. If you execute SELECT statements on many different tables, there will
be a little overhead when the table cache is full, because for every table that
has to be opened, another must be closed. You can reduce this overhead by making
the table cache larger.
MySQL - Why So Many Open Tables?
When you run mysqladmin status, you'll see something like this:
Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12
This can be somewhat perplexing if you only have 6 tables.
MySQL is multithreaded, so it may have many queries on the same table
simultaneously. To minimize the problem with two threads having different states
on the same file, the table is opened independently by each concurrent thread.
This takes some memory and one extra file descriptor for the data file. The
index file descriptor is shared between all threads.