MySQL tuning for low mem
I have Percona XtraDB Cluster running on three nodes, one of which is a 512MB DO VM. Percona provides the cluster gloss on top of MySQL 5.6.9.
Here is memory usage without MySQL running on an otherwise idle 64-bit Debian 8 droplet. I'm running 64-bit because etcd is not certified against 32-bit. (I'm running Debian because Debian thx.)
                     total       used       free     shared    buffers     cached
Mem:                   494        109        385          0          2         52
-/+ buffers/cache:                 55        439
Swap:                  999         19        980
The swap usage is because I'd restarted MySQL several times while preparing this illustration.
After starting MySQL:
                     total       used       free     shared    buffers     cached
Mem:                   494        488          5          0          0         24
-/+ buffers/cache:                464         30
Swap:                  999         74        925
Ouch. I am using InnoDB, but I've tuned innodb_buffer_pool_size down to 32MB and MySQL is still sucking up 400+ MB.
After some experimenting, I found that turning off the performance schema in the mysql config makes a mammoth difference.
"performance_schema = 0" in the mysql config makes a mammoth difference. Here's the same droplet with MySQL running; the only change is turning off the performance schema:
                     total       used       free     shared    buffers     cached
Mem:                   494        160        333          0          1         51
-/+ buffers/cache:                107        386
Swap:                  999         20        979
Holy crap. 350MB just to keep performance stats in memory!?
I think the most important thing the performance schema could tell me in this situation is that the database performs much better without it.
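For reference, the two knobs discussed above live in the mysqld config (file path and section vary by distro; the values are the ones from this post, not general recommendations):

```ini
[mysqld]
# Shrink the InnoDB buffer pool for a 512MB box (the 5.6 default is 128MB)
innodb_buffer_pool_size = 32M

# Disable the performance schema entirely; this is what freed ~350MB here
performance_schema = 0
```

performance_schema is read-only at runtime, so a mysqld restart is needed for the change to take effect.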
Comments
A like/thanks given just because you said "droplet".
The memory-wise difference is apparent, but is there any change performance-wise?
Interesting observation, to say the least!
According to the MySQL docs, performance schema tables are created in memory and never touch disk, so the drop in memory usage seems to make sense.
While the docs also say that gathering information and events comes with very little performance overhead when activated, they might not have had the memory sizes of lowend boxes in mind :-) :-)
following...
So RQLite didn't work out? Would be interested to know why.
Interestingly, turned off by default in current versions of MariaDB but on by default in current versions of MySQL.
Performance Schema was a nightmare in the past: it would allocate all the memory it needed at start and then hold on to it permanently. In more recent versions it's a bit more relaxed, but it can still cause issues with low memory boxes.
Here's another mammoth difference
@rincewind rqlite seems interesting - but doesn't it still have the same limitation as SQLite, i.e. it can't read and write at the same time (since a write locks the database)?
I love SQLite, but that locking is a deal-killer for me :-(
@jarland requires all staff to use proper DigitalOcean marketing terms at all times. Have I mentioned how you can support RAM-intensive workloads with High Memory Droplets?
Not driving it hard enough to really tell yet.
Really, I think the Percona XtraDB cluster should come with the performance_schema enabled by default, because most people will be running it on big systems and a few hundred MB of RAM is nothing. But that's interesting about stock MySQL.
This is another interesting bug in 5.6.12: https://bugs.mysql.com/bug.php?id=68287
I noticed I'd used the same phrase twice and was going to Edit but now that would spoil your joke.
More precisely, to quote the SQLite FAQ: "Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however."
You make a good point that rqlite is ultimately limited by the underlying storage engine.
For those who haven't read about it, rqlite is a RAFT consensus engine that provides consistent, replicated SQLite. RQLite exposes a RESTful interface and handles keeping SQLite in sync between nodes. It works very well and is rock solid in my experience.
However, the underlying storage engine it uses is SQLite, so you have the SQLite limitations (and features, of course). I didn't find this to be a huge issue because I wasn't throwing tons of clients at it, but it could be.
Wow, you have a good memory.
It was an accumulation of inconveniences. Admittedly, I tend to want to use every possible DB feature because my background is as a DBA and I like playing with all the shiny buttons, so RQLite's stripped-down web API was not exactly my cup of tea, but I did like RQLite enough to write a golang client for it.
But there were three things that ultimately bugged me enough:
(1) There's transactional support, but only in a single submission. You can bundle 1000 statements together as one transaction, but you have to submit them all at once. You can't submit in a loop and then rollback if you need to abort, etc. Also it makes handling some things like "insert on this table, get last_insert_id, use that over here, and if something goes wrong, back it all out" impossible.
(2) No bind variable support. My concern here was both security (building strings instead of using bind vars is the wrong way to handle SQL) and also quoting arbitrary data.
(3) DB clients (e.g., perl DBD, golang's database/sql, etc.) hand quoting off to the binary client to handle. My golang client couldn't do that, because it can only speak web. I could have embedded my own copy of sqlite and made some C calls to handle quoting, but SQLite doesn't have a quote() call and you can't prepare() without having the schema, which my local sqlite wouldn't have without getting insanely complicated. I figured I'd either have to pull out some graduate-level text on recursive algorithms to implement my own quoting system or forever be plagued with SQL errors and...NO! Man invented bind variables. Let me use bind variables, for God's sake.
CockroachDB is another product I looked at. Came out of beta recently.
There's a fourth, completely idiosyncratic reason, which has evolved as I've worked on my madcap project to build a set of three nodes on three continents to serve as job management brains:
(4) In RQLite, all writes go through the master. In Percona, you write local and then it tries to sync with others, returning an error if it gets deadlocked, etc. With RQLite, if you try to talk to a node that isn't the master, it will redirect you to talk to the master.
In my case, I don't want one node to be always active and the others to be idle, because I'm paying $5 per. I'd like all of them to be active, or at least to pass the hat around frequently.
For processing the job schedule, dispatching jobs, etc. it gets very complicated (app wise) when you try to do active-active-active. You have multiple brains all looking at the same schedule and queues. Just the app logic of "did someone already fire this job" or "I need to lock this queue/is that queue locked/etc." adds a ton of code. So I'm going to have one master at a time, and every so many schedule runs/minutes/some metric, it'll resign as master and let someone else pick up the baton for a while. I'm thinking every 15 minutes or hour or so.
Now here's the rqlite-related design part. How to determine who's master? I can ask rqlite "hey, are you the DB master?" and follow that, but RQLite doesn't provide a way to force a new election, so I'd end up with one master as the master forever. I could run my own etcd service and do a master election that way via atomic compare & swap, but then 2/3s of the time my master would be on one continent while the DB it's talking to is on another. Ugh. I could just force an rqlite restart (which would elect a new master) but that seems kind of crude.
So I'm back to etcd + Percona for now. etcd handles the master election, and whoever's master writes to their local MySQL, which Percona syncs with the rest. Every 15 mins or so, the master retires, there's a new election via etcd, and that new master starts writing to his local DB.
So far seems to work pretty well.
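The pass-the-baton election described above can be sketched in Go like this. The `kv` store is an in-memory stand-in for etcd's atomic compare-and-swap (real code would use the etcd client and a TTL on the key; all names here are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// kv is an in-memory stand-in for etcd: a store with atomic compare-and-swap.
type kv struct {
	mu   sync.Mutex
	data map[string]string
}

// cas sets key to val only if the key currently holds old ("" means absent).
func (s *kv) cas(key, old, val string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.data[key] != old {
		return false
	}
	s.data[key] = val
	return true
}

// tryElect attempts to claim mastership; exactly one node wins per term.
func tryElect(store *kv, node string) bool {
	return store.cas("master", "", node)
}

// resign releases the key so the next election can pick someone else.
func resign(store *kv, node string) {
	store.cas("master", node, "")
}

func main() {
	store := &kv{data: map[string]string{}}
	fmt.Println(tryElect(store, "tokyo"))     // first claim wins
	fmt.Println(tryElect(store, "frankfurt")) // loses: key already held
	resign(store, "tokyo")                    // master retires after its term
	fmt.Println(tryElect(store, "frankfurt")) // baton passes to a new node
}
```

Each winner writes to its local MySQL for its term, then resigns, which is what lets the mastership rotate between continents without forcing a Raft-level re-election in the database layer.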
wow, too much textwall to read for now, but very much appreciate your efforts in sharing your experiences with the community on a not so common topic ;-) Thank you, Sir!
Buy yours today!
Very useful insights.
RQLite's decisions, like the single bulk transaction and forcing all writes to go through the master, are needed for strong consistency guarantees.
If Percona writes locally and then fails to sync, that could possibly put the DB into an inconsistent state. Like if another write happened in the meantime based on the assumption that the previous local change would sync. Just something to keep an eye on...
I didn't understand why you need to force a re-election in a working cluster. Is this to reduce latency? Elections are expensive.
SQLite has file-level reader-writer locks because that's the best that any embedded database can have. A client-server model like MySQL or RQLite can potentially do record-level locking and have higher concurrency. I don't know how locking is currently done in RQLite, though.
Then again, I'm tempted to throw it out and run Postgres on top of a 2-node DRBD cluster at DO.
My understanding is that you'd get an error, not an inconsistent cluster. It's an optimistic model - the node you COMMIT to goes to the other nodes and says "here's what I'm planning, is that OK?" and if it doesn't get agreement, the transaction is rolled back locally and you get an error.
It's to...I don't know why. I just hate the thought of starting up the cluster and then whoever wins the election is the master forever and the other nodes just sit there doing nothing...I mean, they're doing something in the sense of providing HA, yes...
If I move the baton around regularly, there's never a doubt it will work if it needs to.
I know, pretty weak arguments...if these were three nodes sitting in the same DC it would be more pointless. At the moment one is in Tokyo, one is in Frankfurt, and one is in Las Vegas.
I don't know the exact purpose of performance schema, but found this:
https://dev.mysql.com/doc/refman/5.7/en/performance-schema-memory-model.html
So, when enabled, it reserves all the memory that could possibly be used according to the configuration.
Does this keep the statistics needed for optimizing query execution plans?