From MySQL 8.0.0 to MySQL 8.0.1 – or any other dev milestone

Disclaimer: This post is aimed at you, the curious developer, sysadmin, technologist, whatever-title-you-use. DO NOT run the following lines in production, or even in a stable environment. Do this only if you don’t care what happens to your current data.

If you want to keep up with the newest MySQL developer milestones, I have news for you: there is no upgrade path between milestone versions. The way to go is to remove the old version and install the new one, according to their website:

Upgrades between milestone releases (or from a milestone release to a GA release) are not supported. For example, upgrading from 8.0.0 to 8.0.1 is not supported, as neither are GA status releases.

So if you, like me, had version 8.0.0 and want to test 8.0.1 (although the 8.0.3 milestone is already in development), you need to do something like the following (tutorial based on Debian/Ubuntu servers).
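Since the steps below wipe the existing data, it may be worth taking a logical dump first, while the old server is still running. The following is a minimal sketch, not part of the procedure itself: the function name `backup_all_databases` and the default file name are made up for illustration, and it assumes a local server with a `root` account. Whether a dump taken on one milestone loads cleanly into the next is not guaranteed, so treat it as a safety net rather than an upgrade path.

```shell
# Hypothetical helper (name and default file name are illustrative).
# Dumps every database, plus stored routines and events, into one SQL file.
# -p makes mysqldump prompt for the root password.
backup_all_databases() {
    mysqldump -u root -p --all-databases --routines --events > "${1:-all-databases.sql}"
}
```

Calling `backup_all_databases before-8.0.1.sql` would write the dump to that file in the current directory.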

Stop your service:

$ sudo service mysql stop

Download Oracle’s APT repository package and install it. As of this writing this is the current version; you can get the new package here:

$ wget https://dev.mysql.com/get/mysql-apt-config_0.8.6-1_all.deb
$ sudo dpkg -i mysql-apt-config_0.8.6-1_all.deb

Clean your old install and install the new version. You will lose all the data, so be careful: the backup is on you!

$ sudo apt-get remove --purge mysql-server mysql-client mysql-common
$ sudo apt-get autoremove
$ sudo apt-get autoclean
$ sudo apt-get update
$ sudo apt-get install mysql-server

This is the way to go if you want to test new features such as descending indexes, among others. Remember that the default character set changed from latin1 to utf8mb4.
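One way to confirm what the fresh install gives you is to ask the server directly. The sketch below is an illustration under assumptions rather than an official check: the function name `try_new_defaults`, the `milestone_test` schema, and the table and index names are all invented here, and it assumes you can log in as `root`. It prints the server version and default character set, then creates an index with a descending column.

```shell
# Hypothetical helper: feeds a few statements to the mysql client.
# -p makes the client prompt for the root password.
try_new_defaults() {
    mysql -u root -p <<'SQL'
-- Version and server-wide character set / collation defaults
SELECT VERSION(), @@character_set_server, @@collation_server;
-- Scratch schema so nothing real is touched
CREATE DATABASE IF NOT EXISTS milestone_test;
DROP TABLE IF EXISTS milestone_test.t;
-- Index with one ascending and one descending column
CREATE TABLE milestone_test.t (
    a INT,
    b INT,
    INDEX idx_a_b (a ASC, b DESC)
);
SHOW CREATE TABLE milestone_test.t;
SQL
}
```

On 8.0.1 the first query should report utf8mb4, and the `SHOW CREATE TABLE` output should keep the `DESC` on column `b`. Older versions parse the keyword but ignore it.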

Short feature list:

The complete list is available here.

I don’t know Ops, and that may be OK

I am a Software Engineer at heart. I started as such and worked with PHP for about 7 years, always connecting my work with data somehow, until I got an opportunity and decided to follow my instincts and become a Data Engineer.

I didn’t become a Data Engineer overnight. It was a process. I was lucky to have a boss who noticed my skills with data and decided to give me room to play with it.

But the Ops part was never my forte, and this is why.

Data Integrity

This is my main concern. I care most about keeping data as consistent as possible, and I will often choose consistency over performance.

Another trait of mine is always looking for logic errors that may put bad data into an application. I despise badly written models and have had my share of problems when working with an RDBMS and ActiveRecord. My take on it is: if you have a complex business model, what is easy at first may become painful. Also, there is no silver bullet; you don’t need to use only one technology.

But I won’t go into which is better. What is better is whatever works for you, makes your application work, and doesn’t let your users down, all while maintaining your data integrity.

Value

Your software is not valuable 99% of the time. Your software is a means to an end. It is a way to interpret business logic and generate value for your company. And if your data sucks (duplicated records, lack of foreign key checks, extreme denormalization in the main DB, if relational) you may have no real value at all.

Why don’t I know it?

When I needed to do ops and manage database servers, they were all single servers, with at most a read replica on AWS or a single slave.

One can say I’ve never actually needed to know it. I got lucky having good people working with me on the DevOps team, and we trusted each other’s work. In the end, the managers preferred to use my abilities in another area.

As I said before, I do know it’s an area I need to improve, but it is OK not to know it: even though I’m not an expert, my value lies in understanding the data, the data model, and how the application handles data. A DBA’s main job is usually to keep the database servers healthy; mine is to keep the data itself as valuable as possible to the company.

Not a DBA

What I do is often considered a DBA’s job. There are a couple of areas where the two overlap, but as a Data Engineer I try to support the developers and the business the way a DevOps person does.

You may notice in my posts that I don’t write about replication, clustering, etc. One reason is that I have never had to deal deeply with this particular area in real life. I do know I should know more about it. But it is one thing to set up a couple of servers in the cloud with no real data to analyze and no performance issues to attend to, and another to do it in real life. However, that doesn’t lessen my value.

I know how to prepare data, I do ETLs, I do data modeling, I do deep research on which storage would be the best for a given scenario, and I help define policies around migrations and data access, for instance.

My Goal

My goal is to help developers. It’s to help them do the right thing with data, just as they do with test coverage and code quality.

Everybody reviews code, but I rarely see people reviewing data models. I want to help create a culture where people see data as their true value. Remember this: data leaks are more valuable than “code” leaks, and potentially more devastating too. So please, let me help you.

MySQL version poll: a not so scientific analysis

Prior to my talk at LaraconEU 2016, I was curious to know how much adoption MySQL 5.7 had within the community.

I tweeted this:

Twitter polls only give you up to 4 options to choose from. What I wanted to know was whether people were using MariaDB or other forks like Percona, but I didn’t have enough space, so I only put three options.

This January I managed to get a bit more reach on my tweet and more people replied. I added a 4th option, “Other”. This option could include the forks as well as people still using MySQL 4:

Analysis results

This has no scientific foundation whatsoever. Most of the people in my Twitter bubble work in tech and try to use cutting-edge technology, but I could see a bit of a trend (also taking into consideration the number of people who replied this time).

August 2016 January 2017

You can see that 5.7 gained ground among people for whom 5.5 had been the most common version. I would like to think they upgraded first to 5.6 and then to 5.7, rather than jumping versions and disabling strictness like this to make things work:

SET @@GLOBAL.sql_mode = '';

Again, this is the equivalent of disabling errors in any language because you are not going to fix them and just want to sweep them under the carpet. Don’t do that.
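Before touching `sql_mode` at all, it helps to look at what the server is running with. As a rough sketch (the helper name `is_strict_mode` is made up for this post, and the check only looks for the two strict flags), the following tests whether a `sql_mode` value still has strictness enabled:

```shell
# Hypothetical helper: succeeds when a comma-separated sql_mode value
# contains STRICT_TRANS_TABLES or STRICT_ALL_TABLES.
is_strict_mode() {
    case ",$1," in
        *,STRICT_TRANS_TABLES,*|*,STRICT_ALL_TABLES,*) return 0 ;;
        *) return 1 ;;
    esac
}

# An empty sql_mode, as set by the statement above, is not strict
if is_strict_mode ''; then echo "strict"; else echo "not strict"; fi
# A value that includes STRICT_TRANS_TABLES is
if is_strict_mode 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES'; then echo "strict"; else echo "not strict"; fi
```

To check a live server, the value could come from `mysql -u root -p -N -e 'SELECT @@GLOBAL.sql_mode'`.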

It is nice to see that 5.5 is losing ground (again, take it with a pinch of salt) to newer, more modern versions.

What should I not consider?

Well, you can actually ignore the whole poll as a trend indicator. The first one ran for only a day and got 85 votes without all the options on it, while the second one ran for a week and got 669 votes. On top of that, there is no way to set up a control group to calculate the margin of error.

What does this really mean?

MySQL 5.7 reached General Availability around October 2015, and major hosting and cloud companies started to make it available in February/March 2016. Adoption always takes a bit of time, especially if you have to change any code to support the new version of the database (hint: you probably will). It also means that those companies may at any point stop providing support for versions older than 5.6 (5.5, 5.1, etc.).

Also take into consideration that MySQL 8.0 is under development, and most of the strictness enabled by default in 5.7 will carry over to 8.0. So if you are reading this blog post and starting a new project, go ahead and start with 5.7 so that when version 8.0 comes out you won’t have trouble upgrading.

If you have a legacy application, there are ways of adapting your code so you can enjoy everything the new version has to offer. Just a final reminder: disabling strictness on the server to be able to use the JSON feature may sound like a smart idea in the beginning, but it also means putting your data consistency at risk.
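To make that risk concrete, here is a small sketch of what changes when strictness is off. The names (`strict_demo`, the `codes` table, the `show_strictness_difference` function) are invented for illustration, and it assumes a scratch server where you can log in as `root`. With an empty `sql_mode` the oversized value should be accepted and cut down to fit, with only a warning; with `STRICT_TRANS_TABLES` the same insert should be rejected with an error.

```shell
# Hypothetical helper: runs the same oversized INSERT under both settings.
# -p makes the client prompt for the root password.
show_strictness_difference() {
    mysql -u root -p <<'SQL'
CREATE DATABASE IF NOT EXISTS strict_demo;
DROP TABLE IF EXISTS strict_demo.codes;
CREATE TABLE strict_demo.codes (code CHAR(3));

-- Strictness off: the value is truncated to fit the column
SET SESSION sql_mode = '';
INSERT INTO strict_demo.codes VALUES ('ABCDEF');
SELECT code FROM strict_demo.codes;

-- Strictness on: the same insert is refused
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO strict_demo.codes VALUES ('ABCDEF');
SQL
}
```

Both settings are changed at session level only, so the server’s global configuration is left alone.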