
New releases from MySQL descendants Drizzle and MariaDB

October 20, 2010

This article was contributed by Nathan Willis

For years, MySQL has been the highest-profile open source relational database system, but with the Sun (and, later, Oracle) acquisition of MySQL's corporate parent MySQL AB, the development community has split in several directions. Now, a few years later, both of the leading community-driven forks of MySQL, Drizzle and MariaDB, have made important new releases. Drizzle, the light-and-lean database system designed for web and cloud applications, unveiled its first beta release — complete with MySQL migration tools — and MariaDB, the full-featured database system positioned as a direct competitor to MySQL, made a "gamma" release, and picked up an important endorsement.

Drizzle beta

Drizzle build 1802 was released at the end of September, and was dubbed the "Drizzle7 beta release" in the accompanying announcement. In addition to the usual assortment of speed-ups, bug fixes, and new options, three major features grabbed the headlines. One is the introduction of Sphinx-based documentation. Sphinx is a documentation system based around the reStructuredText markup format, which is intended to make it easier to integrate application documentation inline within the source code itself. Indeed, Drizzle is taking advantage of this feature, storing its documentation in its source tree.

Of more importance to database users, however, are two features that simplify the transition from MySQL to Drizzle. First, the drizzledump backup and restore utility now has the ability to detect when it is run against a MySQL database, and export a dump of the database in a Drizzle-compatible format. For the slightly more daring, it can also dump the MySQL database and import the data and structures directly into a Drizzle database in a single command. Either way, it eliminates the need to run a costly conversion between the two applications.

Second, Drizzle can now speak MySQL's native TCP/IP protocol. By default, Drizzle uses the same TCP port reserved for MySQL, 3306. Future plans are to develop a separate Drizzle protocol running on TCP port 4427, but for the time being, the ability to use MySQL's network protocol has the effect of making it much easier to port applications written for MySQL over to Drizzle.
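
To make the compatibility claim concrete, here is a minimal Python sketch of how a client could read the greeting a server sends when a connection is opened on port 3306. It relies on the publicly documented MySQL packet framing (a 3-byte little-endian payload length, a 1-byte sequence number, then a protocol-version byte and a NUL-terminated server version string). The function names and the sample version string are hypothetical, and whether Drizzle's greeting matches MySQL's field for field is an assumption, not something the announcement states.

```python
import socket


def parse_handshake(packet: bytes) -> dict:
    """Parse the first fields of a MySQL-protocol initial handshake packet."""
    if len(packet) < 5:
        raise ValueError("packet too short to contain a header and payload")
    # Header: 3-byte little-endian payload length + 1-byte sequence number.
    payload_len = int.from_bytes(packet[0:3], "little")
    sequence_id = packet[3]
    payload = packet[4:4 + payload_len]
    if len(payload) < payload_len:
        raise ValueError("truncated payload")
    # Payload starts with the protocol version, then a NUL-terminated string.
    protocol_version = payload[0]
    end = payload.index(b"\x00", 1)  # raises ValueError if no terminator
    server_version = payload[1:end].decode("ascii", errors="replace")
    return {
        "sequence_id": sequence_id,
        "protocol_version": protocol_version,
        "server_version": server_version,
    }


def read_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes or raise if the peer closes early."""
    chunks = b""
    while len(chunks) < count:
        chunk = sock.recv(count - len(chunks))
        if not chunk:
            raise ConnectionError("connection closed before packet was complete")
        chunks += chunk
    return chunks


def fetch_handshake(host: str, port: int = 3306, timeout: float = 5.0) -> dict:
    """Connect over TCP and parse the greeting the server sends first."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        header = read_exact(sock, 4)
        payload = read_exact(sock, int.from_bytes(header[0:3], "little"))
        return parse_handshake(header + payload)
```

Calling `fetch_handshake("db.example.org")` (a placeholder host) against either server would return the protocol version and version string, which is one simple way for a migration script to tell which database it is talking to.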

This feature includes the network protocol only; Drizzle does not support Unix sockets as a connection method, which is part of the project's stripped-down philosophy. Drizzle was started by former MySQL architect Brian Aker in 2007 as a response to what he felt was MySQL's increasing focus solely on enterprise applications, abandoning many of the project's original constituents, web application developers. Aker has said on several occasions that he wants to develop Drizzle as a community project, in contrast to the final days of his involvement with MySQL, when virtually all of the MySQL developers were employed working on the project full-time, and patches from outsiders dropped to virtually zero.

Drizzle adopted a smaller, faster "microkernel" architecture, stripping out advanced functionality such as views, triggers, and query caching, while pushing many of the remaining functions (such as logging or authentication) into pluggable modules. In several places, it simplifies MySQL's multi-faceted design, such as offering only one type of binary blob, specifying UTF-8 as the text format, and UTC as the only timestamp "time zone" format. The result is a faster database management system around one-third the size of MySQL, and one that Aker hopes will be easier for new developers to understand and contribute to.
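
One practical consequence of the UTF-8-only and UTC-only choices is that any conversion has to happen in the application before data reaches the database. The following Python sketch shows what that client-side normalization might look like; the helper names are hypothetical, and it assumes the application holds time-zone-aware timestamps and Unicode strings.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value: datetime) -> datetime:
    """Convert an aware datetime to UTC; refuse naive values outright."""
    if value.tzinfo is None:
        # A naive datetime has no defined offset, so guessing would be unsafe.
        raise ValueError("naive datetime: attach a time zone before converting")
    return value.astimezone(timezone.utc)


def to_utf8(text: str) -> bytes:
    """Encode text as UTF-8, the only text encoding the server accepts."""
    return text.encode("utf-8")


# Example: 09:30 at UTC-5 becomes 14:30 UTC.
local = datetime(2010, 10, 20, 9, 30, tzinfo=timezone(timedelta(hours=-5)))
normalized = to_utc(local)
```

Rejecting naive timestamps instead of assuming a local zone is a design choice made for this sketch; an application with a known fixed zone could attach it explicitly before calling `to_utc`.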

The project allows all contributors to retain their own copyright on their contributions. The project uses Bazaar as its source code management system, making incremental releases every two weeks. Aker stated in 2009 that there were more than 100 contributors to the project, which is roughly in line with the size of the Drizzle-developers team on Launchpad.net. According to Launchpad, there are 325 active branches under development, owned by 66 developers or teams. Also telling is that the developers hail from a variety of different employers, including Canonical, Google, Oracle/Sun, and Rackspace.

In addition to the community focus, Drizzle is optimized for "cloud" and web application usage in a number of ways. As mentioned earlier, it provides TCP/IP as its only connection method. It is also optimized for 64-bit processors and "massive concurrency" over multi-core and multi-CPU machines, including sharding across multiple nodes. Finally, it is built for Unix-like servers only (offering no Windows version), and supports external stored procedures in scripting languages like Ruby, Perl, and PHP.

MariaDB gamma

While Drizzle is an attempt to hone the MySQL code base into a lean-and-mean database manager, MariaDB takes nearly the opposite approach, building a system with an array of high-end options well suited for enterprise usage. MariaDB was started by MySQL creator Michael "Monty" Widenius in 2009, with the goal of developing a community-driven project that could serve as a drop-in replacement for the official MySQL.

The 5.2.2-gamma release of MariaDB was also announced at the end of September, and is described as a "release candidate" marking the end of the 5.2 development cycle. The list of new features is notably longer than Drizzle's, including a reworked version of the default InnoDB storage engine and two entirely new storage engines: OQGRAPH, which is designed for storing tree-like structures and complex graphs, and Sphinx, a text-oriented storage engine (this Sphinx bears no relation to the Sphinx documentation system used in Drizzle; chalk it up squarely to coincidence).

Also new is support for virtual columns (fields containing expressions that are evaluated upon retrieval), segmented key caches for the MyISAM engine (which allow multiple threads to fetch keys simultaneously without locking the entire cache), the ability to CREATE tables with storage-engine-specific attributes, an extended user statistics system, and pluggable authentication.
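
The idea behind a segmented key cache can be illustrated with a small Python sketch: instead of one lock guarding the whole cache, the cache is split into independent segments, each with its own lock, so threads working on keys in different segments do not block one another. This is only an illustration of the general technique; MariaDB's actual implementation lives in the MyISAM key cache code, and the class name, segment count, and hashing choice below are assumptions.

```python
import threading
import zlib


class SegmentedCache:
    """A dictionary-like cache split into independently locked segments."""

    def __init__(self, segments: int = 4):
        if segments < 1:
            raise ValueError("at least one segment is required")
        # Each segment pairs its own storage with its own lock.
        self._segments = [({}, threading.Lock()) for _ in range(segments)]

    def _segment_for(self, key: bytes):
        # A stable hash maps each key to exactly one segment.
        return self._segments[zlib.crc32(key) % len(self._segments)]

    def get(self, key: bytes, default=None):
        data, lock = self._segment_for(key)
        with lock:  # only this segment is locked, not the whole cache
            return data.get(key, default)

    def put(self, key: bytes, value) -> None:
        data, lock = self._segment_for(key)
        with lock:
            data[key] = value


def fill(cache: SegmentedCache, worker_id: int, count: int) -> None:
    """Insert `count` entries whose keys are namespaced by worker."""
    for i in range(count):
        cache.put(f"{worker_id}:{i}".encode(), i)


cache = SegmentedCache(segments=4)
workers = [threading.Thread(target=fill, args=(cache, n, 100)) for n in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

With a single segment this degenerates into one global lock, which is the situation the feature is meant to avoid; raising the segment count trades a little memory for less contention.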

MariaDB 5.2.2 is based on the MySQL 5.1.50 source code, but several of the new features mentioned above — such as extended user statistics and segmented key caching — come from other sources. On top of that, some of the new functionality is still in development for Oracle's MySQL, including virtual columns. The official MySQL's pluggable authentication system is available only to Oracle customers with commercial support contracts. MariaDB comes with an authentication module that allows the system to use existing MySQL user accounts, thus easing the transition between the two products.

On the whole, however, MariaDB aims at compatibility with MySQL. Widenius's new venture is partly an attempt to rebuild the community-based development approach that MySQL enjoyed in its early days, and partly an attempt to build a different business model around database development. Unlike MySQL AB, which was sold to Sun in a deal that he engineered, Widenius describes his new business Monty Program AB as a "hacker business model" where revenue from its support contracts goes directly back into maintaining the code. He also founded the non-profit Open Database Alliance with other MySQL service providers to attract various independent support providers and database resellers.

Like Drizzle, the MariaDB source code is hosted at Launchpad.net. In contrast, however, contributors must sign a contributor agreement that assigns joint ownership of the contribution to Monty Program AB. Monty Program AB employees also review all patches and contributions and approve membership in the Maria-captains team that has commit rights. According to Launchpad, there are 21 Maria-captains members, and 149 in the larger Maria-developers group, all working on 62 active branches.

Despite intentionally following the official MySQL development series, MariaDB has started to attract attention on its own. Last week, a number of former MySQL executives launched SkySQL, a database support company competing head-to-head with Oracle's services — including support for MariaDB alongside support for MySQL. SkySQL executive Kaj Arno told InternetNews.com "If you are a MySQL customer and your bug is fixed in MariaDB, I think it might make sense to move," though he added that encouraging customers to migrate was not the company's goal.

Lessons learned?

To say that Oracle's acquisition of Sun and the open source projects it stewarded has been poorly received by the community would be quite the understatement. This month, the big debate is over OpenOffice.org (OOo) fork LibreOffice, and Oracle has taken a hard line: renewing its public commitment to OOo and threatening to excommunicate OOo community council members who do not distance themselves from the new project.

Looking at how well the MySQL forks have matured, however, it does not look like LibreOffice supporters have too much to fear. Drizzle and MariaDB are both prepared to help any interested MySQL users migrate away from the platform. Drizzle is taking shape as a fast and light replacement for the large market segment of customers whose MySQL database is primarily designed to serve as the back-end of a web application, while MariaDB is actually ahead of MySQL on supporting high-end features for enterprise customers. In either case, MySQL may not enjoy its current position as the default database of choice for much longer.





New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 21, 2010 11:44 UTC (Thu) by nix (subscriber, #2304) [Link] (3 responses)

I'm stunned that Drizzle has dropped query caching. Perhaps this is to keep cache locking down, or something, but still, cloud apps tend to run the same queries over and over again, many times, on many nodes. Surely a cache for their parsed representations is something you should provide early on, not strip out!

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 21, 2010 11:59 UTC (Thu) by jbh (guest, #494) [Link] (1 responses)

Replace the built-in cache with a plugin, I think. See for example
http://dedcode.wordpress.com/2010/08/16/memcached-query-c...

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 22, 2010 9:07 UTC (Fri) by nix (subscriber, #2304) [Link]

Ah, now *that* makes sense.

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 22, 2010 0:40 UTC (Fri) by Velmont (guest, #46433) [Link]

I agree. I turned off query caching by mistake just a few days ago, and that hit the web server *really* hard! I was wondering why it was running so slow before I checked the munin graphs and saw that no queries were cached (and I could see the load building up).

If they do use a plugin-method, well, they better have an easy plugin that's plugnplay, or it won't be very useful for all the uses I'd use it for.

Memcached is nice and all, but most web sites still don't need that.

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 24, 2010 23:47 UTC (Sun) by jengelh (subscriber, #33263) [Link] (6 responses)

I'll wait and see, and maybe start using whichever MySQL offspring first switches to Git; that's what I thought when reading about bzr ;-)

I see the removal of PF_LOCAL socket support as a showstopper. TCP sockets are scarce in number; I did manage to use them all up with mysqlslap at one point.

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 25, 2010 6:53 UTC (Mon) by nix (subscriber, #2304) [Link] (5 responses)

While normally I'd scoff at anyone choosing against a program because of the VCS it is stored in, I can completely understand it if the VCS is bzr. I have never encountered a version-control system slower at cloning new checkouts, nor one more prone to endless loops of network fetching if anything goes wrong with the checkout. (Last month I cloned Emacs. I waited five hours for the bzr checkout, gave up, and checked it out from a git mirror instead. The git clone took less than ten minutes. Remind me again why anyone uses bzr for anything?)

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 25, 2010 7:40 UTC (Mon) by anselm (subscriber, #2796) [Link]

With Emacs, it's political. Bazaar is the only DVCS that is also an official GNU project. See this LWN article.

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 25, 2010 7:43 UTC (Mon) by jbh (guest, #494) [Link] (1 responses)

And then, to add insult to injury, they fixed the problem by adding a bzr notification daemon to alert you when the operation is finished. So that you can do other things while you wait, natch. See comment #1 of https://bugs.launchpad.net/ubuntu/+source/bzr-gtk/+bug/34....

(Normally I wouldn't care, but the damn thing was turned on by default when I installed the bzr graphical tools. Since then, I have avoided it whenever possible.)

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 25, 2010 19:30 UTC (Mon) by nix (subscriber, #2304) [Link]

That is truly ridiculous. A whole notification daemon, running all the time and reporting on the completion of cheap operations as well as expensive ones, to cover something that could be equally handled by *ringing a bell* when the expensive operation completes?! Talk about utterly pointless bloat.

(Not that git or hg bother to ring bells when expensive operations complete, because even massive tree clones in both of these tools take a fraction of the time they take in bzr.)

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 27, 2010 17:32 UTC (Wed) by brian@tangent.org (guest, #70860) [Link]

Hi!

Thanks for your comment about unix socket support. I went on and added a plugin last night so that Drizzle will handle the MySQL UNIX Socket protocol. It will be in the next beta.

As far as VCS goes, my personal take on this is that they are all the same (assuming that they are distributed!). Our choice for BZR is more about Launchpad than the tool itself. LP is a wonderful site to use and github is just a little too minimalistic. We have also found that users tend to pick up BZR more quickly than GIT, and that GIT support on Windows is really subpar. If someone finds that they feel strongly enough about GIT, I am sure that a mirror could be run.

Cheers,
-Brian

New releases from MySQL descendants Drizzle and MariaDB

Posted Oct 31, 2010 13:08 UTC (Sun) by fjalvingh (guest, #4803) [Link]

Having a branch take 5 hours is horrible, of course, but I wonder if this is standard or an exception? I just branched emacs trunk and got it in 19 minutes. Git is faster, but for an exceptional case (getting a fully new and quite big branch and not yet having a shared repository with parts of it present) I do not think it's unreasonably slow.

I do know some reasons why bazaar is used though, and they have nothing to do with political reasons:

1. The Bazaar user interface (meaning its command set, and the handling of commands) is way, WAY better than Git's or Mercurial's. Its command set is well thought out; commands do NOT do unexpected things AND do only ONE thing. There are no options that change behavior in unexpected ways.

Commands have good default behavior and are very easy to explain to developers that are not that interested in DVCS' operation.
For people who are used to learning their tools fully and understanding every nook and cranny of them this is not important, but it is very important for the "I'm just using it" people. It has a small learning curve, and things do not go wrong often because people quickly learn what's needed, and do not have to learn many things because the tool does lots of things right automatically.

2. Bazaar has first-class rename support. Whether this is important depends on what you're developing but in some cases it can be very important. In our case we're using Java and we support multiple versions of the same product. When you refactor Java code the .java source files move all over the directory tree. Bazaar's rename support allows us to refactor our code in newer versions while maintaining the ability to merge fixes and changes from older versions into that refactored code- with remarkably little (usually zero) conflicts and problems.
Not having this means we would effectively be banned from doing any kind of refactoring at all, because we would get merge conflicts all over when merging fixes from older versions upwards.

Bazaar is slower than some other tools on some actions, but most of it is hardly noticeable once you start using things like shared repositories. All base actions are quite fast- at least fast enough not to be noticeably slow. Having a commit finish in 0.02 or 2 seconds is actually not important, even though one is 100x faster.

My biggest "gripe" with bazaar is that it's child's play to functionally (and sometimes technically also) destroy a repository. Some bazaar versions have horrible bugs that will destroy your repository if they are used on it. I've lost all history of work on our main repos *twice* because of that!!!! And to this day there *still* is NO way to define a repository as "usable by versions x, x and x only". Since distros like Ubuntu change quickly, multiple versions of bazaar easily proliferate, so it is kind of a time bomb...

Likewise, support for things like proper line endings (CR, LF, CRLF) is there in theory, but its definition is not part of the repository but must be maintained on every user's workstation!?!?! Any user forgetting to configure this properly will slowly destroy the repository with every commit he does, without any way of checking. I do not understand why someone would create such a stupid way of configuring things, but I am flabbergasted that it is not fixed!

A similar problem is related to plugins. Because there is no central "gatekeeper" in bazaar it is often important to have commit-time checks executing /before/ sh*t enters the repository. We for instance scan for proper file encoding (only UTF-8 where that is allowed; only iso-8859-15 at other places etc), proper source structure (no whitespace at line ends, tab/space usage etc). These kinds of checks are often needed because badly configured IDEs can easily save a fully-changed version of a file even though only 1 line was changed, for instance because leading spaces are translated to tabs or v.v.

Bazaar has no way to tell //to a repository// that changes are not allowed unless version x.xx of a plugin is present. So people can forget to install it and commit problems until discovered. Idiocy.

These things alone mean that for us bazaar is fragile, and the repository needs lots of maintenance (several times a year) when someone forgot to follow company guidelines for tool configuration. It's the prime reason why I closely monitor progress on other DVCSs. But the ease of use and the rename support are too important to leave bzr right now.


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds