Sunday, 13 August 2017

How to have a higher chance of success when restoring a big MySQL database

Restoring a MySQL database is fast and easy when you just copy files in datadir when the server is shutdown, or if you use Percona xtrabackup.

But if you for some reason (AWS RDS) only have MySQL protocol available for backup, you usually can have a compressed mysqldump, that is quite slow to restore, not because of the compression or because the decompressed version is a text file that needs to be parsed, but because MySQL is slow to push it through it's disk pipeline, and because it needs to build data indexes while doing a restore.

I've spent multiple days babysitting the process of restoring a 7GB gzip compressed MySQL dump file, and these are results and tips that could help you save some time.


So, make sure that:
- you have enough IO available: For restoring a 66 GB datadir 315.6 GB was written to the drive (as measured with iostat), with a tuned MySQL configuration. For a DB of this size a mechanical drive doesn't cut it, and restore will take multiple days. Use a good SDD.

- your database TRIGGERS all have BEGIN/END statements (even though you can create them without and even thought the bug was supposed to be fixed https://bugs.mysql.com/bug.php?id=16878),  it fails on restore, with all versions of MySQL 5.7/5.6 i tried

- you start with a really empty database in your datadir - DB I worked with had inconsistent data types on a foreign key, when the dependent table with an inconsistent key already exists MySQL will report a foreign key error (MariaDB will be more informative), but if it doesn't it will happily restore the database

- your max_allowed_packet conf value is big enough, or you'll get a MySQL server has gone away message from your client while restoring.

- your innodb_log_file_size is big enough (https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_log_file_size) - if you have large BLOB values in your DB, restore will fail if the value is lower than 10% of your blob field. This setting is important for quick restore too

- you have log-bin turned off in order to minimize your chance to run out of drive space and save IO (log-bin=Off doesn't mean that it's disabled just that the log bin files start with Off, the documentation can be confusing here :)  What worked for me is having all log-bin lines in the mysqld config section commented out


Finally, if you want it to finish quickly, use the fastest SSD you have available, and consider tuning MySQL configuration a bit. I'm also considering using a ramdisk, because it would help both with restore speed and when you need to do some DB transformations. MySQL defaults are not reasonable, especially for innodb_log_file_size, max_allowed_packet.

I used excellent pv for figuring out if the restore process will finish in reasonable time

pv db_dump.gz |gunzip -c |mysql -uroot database_name



Here's a full list of my MySQLd configuration variables that worked for me on my dev laptop

#my dev laptop is low on memory, for prod server you would use a lot more
innodb_buffer_pool_size=512M
innodb_additional_mem_pool_size=256M
innodb_log_buffer_size=256M
innodb_log_file_size=512M
max_allowed_packet=64M

#for saving disk IO, dont use on prod
innodb_flush_log_at_trx_commit = 2
innodb_flush_method=O_DIRECT_NO_FSYNC
skip-innodb_doublewrite




Sunday, 30 July 2017

Some thoughts on (Modern) PHP



I have experience in both Java and PHP. Java mostly for traditional desktop apps, embedded UIs and PHP for websites.


Custom PHP frameworks I've built or helped build took into account the way PHP is executed: you are stateless and need to setup everything on every request (runtime is fast to start with FPM and opcode cache). Namespaces based cheap autoloader worked great. We used singletons for getting the configuration and connections to DBs. There was almost no setup code that needed to be run every time other than loading .ini based configuration and connecting to the DB. My webapps responded under 20ms (DB and other services like sphinx included), and I could get it to respond in 1 ms for things where we needed to be quick and didn't have to output HTML with Forms. It was really small and you could read the whole framework code in 1-2 hours. It worked with SQL in a reasonable way. You didn't have to write your SQL for simple CRUD, but for larger things involving joining multiple tables and more complex expressions we wrote native SQL. Caching was done thoughtfully, using  APC user cache (SHM with zero copy). It just felt nice.


I switched jobs recently, and started with Symfony 3. The thing felt like some Java framework, but poorly documented and harder to use than it should be. It had lots and lots of setup code done before handling every request. There's a whole DI framework with it's load of setup code for every component. And you have to do setup even though you don't use the component in that particular request. There are ways of doing setup lazily, but you still waste time to wire that up. Framework overhead can be 30-100ms. Other modern PHP frameworks often have similar overhead. I know that there's PHP-PM, to save some of that work that isn't really $_REQUEST specific, but it doesn't seem to be used much for production. And using Silex (deprecated by Symfony 4?) is really not that different, you still either reuse Symfony components, or rewrite them, but with similar "best practices" that are inspired by Java.


Regarding persistence, Doctrine and it's verbosity feels very ugly to me. I'd much rather use SQL syntax for defining relationships than bunch of PHP with special syntax comments or xml or yaml. And also use real SQL for complex queries.



Everybody is using type hints wherever they can, and it feels as verbose as Java, but without compile time type safety, and you can't really put type information everywhere (class members for example).


So you are almost using a type safe language, but can't get performance or compile time benefits, because inevitably, you'll have to use some dynamic typing or other dynamic language features.

Even though PHP runtime has made great progress with 7.x (it's probably the fastest interpreted language, and it's great it has reference counted garbage collection), it feels like language is struggling to find it's identity, with it taking a lot from Java and still coping with ugly legacy ($, having to use $this inside a class function, php.ini, features for supporting templating even though it's rarely used as a templating language in modern frameworks, https://3v4l.org/mcpi7).


Learning Python and Flask (as an example) was much more enjoyable than switching from a nimble custom PHP framework to Symfony. Using NodeJS and minimalistic components to build my own framework was also nice. I'd love to try GoLang, Swift or Rust in the backend too.


And there's the thing that most of the PHP frameworks try too hard to be full stack, when nowdays it's not rare you only do REST APIs on the backend. So there's a lot of infrastructure and assumptions in place for rendering HTML that you really don't need to use and that gets in the way when learning the framework and is wasteful when the code is executing.


I'd argue you can write fast, simple and maintainable PHP, by using state PHP runtime has setup for you ($_POST, $_GET, $_SERVER etc), namespaces and a namespace based autoloader, trying to use pure functions when you can (using static classes shouldn't be a sin - use it to split your code in sensible parts), and using general good practices for writing readable and maintainable code (avoid long functions, huge classes, too much block nesting, decoupling, naming things in a good way). With some coding conventions you can write a decent and productive framework quickly, but you could do that with a nicer language too, so what's the point?

(Thankfully, I'm not using Symfony on my new job, and Yii2 does suffer from some issues too, it at least feels better for now)




Saturday, 14 May 2016

Kabelski internet i oversubscription


Ovo je post iz 5.11.2014. U međuvremenu sam promjenio kabelskog operatera 

Ako imate kabelski internet to znači da najvjerovatnije koristite jednu od sljedećih kabelskih tehnologija za prijenos digitalnih podataka:
- DOCSIS 1.0, 1.1, 2.0, 3.0 ili EuroDOCSIS standardi
- PacketCable 1.0, 1.5, 2.0 standardi koji na DOCSIS bazi grade razne usluge poput telefonije i digitalne televizije

Frekvencijski pojas svakog kabela podjeljen je na kanale. Širina kanala ovisi o standardu pa tako EuroDOCSIS koristi europsku širinu kanala od 8 MHz , a DOCSIS koristi američku od 6 MHz.

Podjela bandwidtha koaksijalnog kabela (Maksimalni downstream bandwidth koaksijalnog kabela je 4864 megabita prema primjeru niže)


Svi spomenuti DOCSIS transportni standardi imaju slične karakteristike oko toga koliku downstream propusnost podržavaju po jednom megahertzu, pa tako DOCSIS podržava 38 megabita po kanalu downloada, a EuroDOCSIS 50 megabita po kanalu downloada.

DOCSIS 1.1 je donio bolju standardizaciju i mogućnosti kontroliranja kvalitete usluge (QoS)

DOCSIS 2.0 je donio bolje upload brzine (27 megabita po kanalu u odnosu na DOCSIS 1.0 9 megabita po kanalu)

DOCSIS 3.0 je donio mogućnost da jedan korisnik istovremeno koristi više kanala tako povećavajući bandwidth.

DOCSIS 3.1 izdan u Listopadu 2013. je prva veća promjena u standardu jer donosi novu modulaciju 4096 QAM i odustaje od podjele kanala na 6 ili 8 MHz i umjesto toga koristi manje OFDM podkanale i u idealnim uvjetima podržava brzine do 10 gigabita downstream i 1 gigabit upstream. Još nije u primjeni.

E sad, sve je to divno i krano, ali zašto je uz takve ogromne brojke moj internet spor?

Koaksijalni kabel je medij koji dijelimo sa drugim korisnicima, za razliku od DSL-a gdje svaki modem ima vlastitu bakrenu paricu do centrale, kod kabelskih mreža dijelimo medij sa neodređenim i samo vašem ISP-u poznatim brojem korisnika. Obično operater nudi i uslugu kabelske televizije te je prostor za vaš internet sužen sa brojem kanala koji se koriste za TV uslugu.

Ajmo vidjeti jedan primjer u praksi na zagrebačkom području, za downstream:



Motorola SBV5121E

Koristi se modem Motorola SBV5121E (DOCSIS 2.0 i niže), što prema specifikaciji [2] znači da ima bandwidth za downstream od 88 do 860 MHz sa američkom širinom kanala od 6 MHz. Znači 772/6 = 128 kanala. Operater koji sam analizirao po mom saznanju 40 analognih TV kanala i 113 digitalnih. Recimo da se za ovih 113 digitalnih troši 30 6 MHz kanala u kabelu. Što znači da recimo srijedu uvečer, kad se ljudi vrate sa posla i škole, samo 58 različitih kućanstava (kanala) može istovremeno surfati punom brzinom od 38 megabita, svaki sljedeći korisnik koji krene surfati smanjuje brzinu ovim ostalima. 
Graf latencije (do prvog hop-a) na primjeru Zagrebačkog ISP-a dok korisnik osim za mjerenje ne koristi uslugu.

Operater kojeg sam analizirao nudi brzine od 8 megabita, što znači da bi teoretski trebao moći dati traženi bandwidth za (38/8) *58 = 275 korisnika, no pošto se tu vrijeme provedeno na kanalu po korisniku mora smanjiti kako bi se jedan kanal podjelio na više kućanstava, u tim slučajevima, čak i da surfa samo 275 kućanstava, njihova latencija (ICMP ping) sa odličnih 6-7 ms počinje rasti na (worst case, puna utilizacija na 418 korisnika) 4.75*7= 33 ms (molim ispravak ako je računica netočna, uzimam u obzir najmanju veličinu ICMP paketa tj. najmanju moguću diskretnu jedinicu u kojoj je moguće ostvariti komunikaciju). 

Dodatni problem je što DOCSIS 2.0 i niži ne omogućavaju brzo prebacivanje među kanalima, što znatno otežava dobru iskoristivost frekventnog spektra kabela (možda na drugim kanalima ima značajno više prostora za prijenos podataka).

U svakom slučaju, ako je previše korisnika koji dijele isti resurs (isti 6 MHz kanal, isti kabel) dolazi do drastičnog brzine pristupa pa tako kod ISP-a koji sam analizrao bandwidth pada na ispod 1 megabita, a ping ide i iznad 140 ms, uz česti packet loss.

Rijetko kada svih korisnici žele istovremeno i na period dulji od nekoliko minuta maksimalni bandwidth, pa je moguće (prema brojkama u primjeru) imati 10x više korisnika nego što je ukupnog kapaciteta (npr. 418 korisnika na 8 megabita na 88 kanala nego 4180 korisnika) a da sami korisnici ne primjete probleme u brzini pristupa, ali to uvelike ovisi o načinu korištenja Interneta. Moguće je da će više učenja na daljinu, skidanja igara preko Steam-a i sličnih servisa itd. značajno promjeniti navike korisnika u budućnosti.

Posao dijeljenja bandwidtha kada je više korisnika od broja slobodnih kanala rade zajedno modem kod korisnika i CMTS uređaj kod operatera. CMTS radi mnoge slične funkcije koje u DSL sustavima radi DSLAM, ali uzevši u obzir karakteristike dijeljenog koaksijalnog medija. CMTS omogućava da i do 1000 korisnika dijeli isti 6 MHz kanal. Koristi tehniku zvanu Statistical time division multiplexing. Nisam našao na podatak može li jedan CMTS uređaj stvarno i napuniti svih 128 downstream kanala i još 60 Mhz upstream bandwidtha. Svakako mu za to treba barem 10 gbit ethernet sučelje.

ISP može poboljšati infrastrukturu tako da smanji broj korisnika koji dijele jedan jedini kabel, ili poveća broj kanala koji se koriste za DOCSIS ukoliko medij ima slobodne kanale.
Također, ISP može početi koristiti digitalnu TV kako bi iskoristio mogućnost digitalne kompresije video i audio zapisa i time smanjio potreban bandwidth po TV kanalu za bar 4 puta (moguće i više sa kompresijom naprednijom od MPEG2), no ovo znači da operater mora svim korisnicima zamjeniti receivere za TV, što može biti značajna investicija.

Osnovana je i Facebook grupa gdje se korisnici mogu požaliti na svog operatera ili raspravljati o boljim operaterima i tehnlogijama poput recimo FTTH ili VDSL-a.

Pridružite nam se na:

https://www.facebook.com/groups/hocuboljiinternet/


Linkovi:






Sunday, 1 May 2016

Smart public transport with small automated, semi-automated or manually driven vehicles

Here's just and idea (feel free to use it in any way):

Imagine having a network of small (4-6 passengers) vehicles servicing a city for daily transportation needs. Users would enter a desired location and arrival time. The arrival time could be flexible (within an hour, if not then the price could be appropriately higher) and the user would announce any regularity (for example detailing a weekly commute) that could be used for future planning.

The centralized system would optimize the problem of getting all passengers to their respective  locations and suggest departure time and location (preferably within a few minutes of walking distance).

An interesting open source implementation would use OpenStreetMap data and have simulations and visualizations. A commercial entity could deal with deployments on various locations and provide a stable software as a service around the core open implementation. Autonomous vehicles would provide much more efficient operation of such a network and lower the costs significantly.

Thursday, 6 March 2014

A tale of false alarm by ConfigServer, CPanel and a hosting provider.


I'm responsible for a couple of CPanel/WHM managed dedicated servers.

We  keep them updated, and try to do as little customization as possible outside of what cPanel knows about. We enabled mod_proxy_fcgi and PHP-FPM, so we can use Apache 2.4 MPM Event for our fairly high traffic web site. It's a unfortunate that CPanel doesn't have this configuration available out of the box, but that's for another blog post.

Today early in the morning we got a message from our lfd daemon (a service installed by a free ConfigServer Security & Firewall CPanel plugin installed by our hosting provider):

The following list of files have FAILED the md5sum comparison test. This means that the file has been changed in some way. This could be a result of an OS update or application upgrade. If the change is unexpected it should be investigated:
/usr/bin/ghostscript: FAILED
/usr/bin/gs: FAILED

The funny thing is, nothing upgraded any RPM files in this time window, our /var/log/yum.log didn't mention any upgrades to ghostscript package that provides the /usr/bin/gs binary (/usr/bin/ghostscript is a symlink to gs), we have disabled automatic updates that can be initiated by the cpanel upcp --cron sciprt, but the system us regulagrly kept up to date manually with yum update.

I've reinstalled the package with yum reinstall ghostscript (ghostscript-8.70-19.el6.x86_64 was reinstalled)

and the binary size and md5sum changed like this:

before:
size: 19152 bytes
md5sum: c64b5016d94450b476148c31cfef61ff

after reinstall:
size: 6760 bytes
md5sum: 73db43e258c4b191757b7ba75a883321

This is what actually happened: Our managed hosting provider had apparently changed our setup to upgrade our system packages automatically (probably with best intentions due to recent gnutls issue). And prelinking seems to be enabled on our system, so when upcp (CPanel automatic upgrade cron script that runs periodically) executed /usr/local/cpanel/scripts/rpmup to upgrade system packages, it also did the prelinking step, adding extra prelinking stuff to our /usr/bin/gs binary.

Similar issue described here:

http://linsec.ca/blog/2012/01/23/rpm-v-and-prelinked-binaries/


Friday, 16 August 2013

Dota 2 Wine optimization for Intel GPUs

Dota 2 for Linux implements it's 3D engine by using a Direct3D to OpenGL translation layer called ToGL. I assume that this layer can be used in different ways, but for Dota 2 it seems to be used in a less than ideal way as documented previously here. In short, Dota for Linux compiles 11000 shaders on startup, compared to just 220 the Wine version does. This causes much higher memory usage (1.2 GB vs 2.6 GB) and start-up time (35 seconds vs 1:15 min).


With Wine we actually do get the source of their Direct3D to OpenGL layer called wined3d, since Wine is open source. It's funny, the stack used to run Windows version of Dota 2 is actually more open.

Since Dota 2 for Windows when run on Wine actually outperforms native Linux version in some important aspects, and it's framerate is just slightly less, I decided to take a look on improving its performance.

I've used a tool called apitrace to record a trace of a Dota 2 session with wine so I can analyze the OpenGL calls and look at driver performance warnings (INTEL_DEBUG=perf) with qapitrace.

I optimized two things:

1. Reduce the number of vs and ps constants checked

There were many calls to check values of VS (vertex shader) and PS (pixel shader, also called fragment shaders in OpenGL) constants each frame like this:

532550 glGetUniformLocationARB(programObj = 152, name = "vs_c[4095]") = -1

This function is called from shader_glsl_init_vs_uniform_locations() in glsl_shader.c in
wined3d.

It uses GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB, defined to be 4096 in #define MAX_UNIFORMS in Mesa source.

Dota 2 doesn't need so many uniforms, most checks return -1, and wined3d checks all of values for both VS
and PS uniforms.

I reduced this number to 256, just enough for Dota 2. This saved thousands of calls per frame.

2. Use fast clear depth more often

Intel driver complains about not being able to use fast depth clears because of scissor being enabled. Turns out that device_clear_render_targets() in wined3d device.c doesn't really need to do glScissor for Dota 2, it's probably an optimization that maps better to Direct3D driver.


A small patch including both optimizations is here:
https://gist.github.com/vrodic/6437312

This patch is a hack, and glScissor part probably breaks other apps, so this is just for Dota 2. It maybe could be made in a better way so it could be merged in Wine, but I'm not wined3d expert.

So how faster is it? A solo mid hero on a setup described in the previous blog post used to get 41 FPS. Now it gets 46-49 FPS. Native version is similar to optimized Wine, but in some situations it gets worse than Wine optimized.

Ideas for improvement:

Dota 2 for Linux needs  ~7500 calls per frame. Wine version, even after my optimizations needs 37000 (EDIT: just as I was writing this post, there were some improvements, now its about 22000).

There is probably a way to optimize this even more, but it's outside of the scope of an afternoon project,  like this was. I'd like to keep on digging though.

Wednesday, 14 August 2013

Dota 2 performance: Linux/native vs Linux/Wine vs Windows 7 on Intel GPU

So how well does the mega popular game Dota 2 work on Linux? I've had some time to make detailed tests on my Intel IvyBridge GPU laptop (Lenovo ThinkPad X230). The graphics settings are the same on all versions.


Dota 2 Windows binary under Wine 1.6
Startup: 35 secs (over ntfs3g userspace fs driver that is not that fast)
Mem: 1.0GB
FPS: 37.5 FPS
Power usage LP mode (patched wine): 34W

Dota 2 Linux native
Startup: 1 min 14 secs (native ext4)
Mem: 2.6 GB
FPS: 40 FPS

Dota 2 Windows native - Windows 7
Startup: 25 secs
Mem: 1.2GB (measured by Windows Task manager)
FPS: 80 FPS
Power usage LP mode: 24W

Test setup:

CPU: Core i5 3320M

Resolution: 1366x768

GPU settings (same on all Dota 2 versions): shadows MEDIUM, textures HIGH, render quality: HIGHEST, all other: OFF, vsync: disabled

GPU settings LP mode (for power measurments above): Shadowd low, effects OFF, textures MED, render quality: LOWEST, fpx_max: 30

Mesa version: git-8b5b5fe (with rendering regression fix from here: https://bugs.freedesktop.org/show_bug.cgi?id=67887)

Linux distro: Ubuntu 13.10, kernel 3.11 drm-intel-nightly, running LXDE

FPS Benchmark method:
in "dota 2 beta/dota/cfg/autoexec.cfg"
cl_showfps 2
playdemo test

For FPS measurement number: look at the last 240 frames when the demo is ending

Memory measuring: RES column with `top`
Startup time measuring: stopwatch until the map is loaded

Analysis of the apitrace trace file:

I've made a trace of Dota 2 with apitrace, revealing possible performance issues.

Before the first frame of the game is drawn 11038 shaders are compiled. That is most likely why the load time is so slow and memory usage is so high. In addition a lot of the shaders being used seem to be recompiled by the Intel driver when rendering frames.

There are 162 frames in the trace I've analyzed, 193 shader recompiles, and 643 different shader programs (each program has 1 VS and 1 FS) used.

In contrast, Wine version of Dota 2 compiles only 220 shaders.

Performance feedback from the Intel driver:

glretrace of apitrace prints driver performance warnings. A sample of some that repeat every frame. These include shader recompile warnings.

575332: glDebugOutputCallback: Medium severity API performance issue 13, Clear color unsupported by fast color clear. Falling back to slow clear.
576094: glDebugOutputCallback: Medium severity API performance issue 14, Failed to fast clear depth due to scissor being enabled. Possible 5% performance win if avoided.
577739: glDebugOutputCallback: Medium severity API performance issue 4, Using a blit copy to avoid stalling on 480b glBufferSubData() to a busy buffer object.
577801: glDebugOutputCallback: Medium severity API performance issue 8, Recompiling vertex shader for program 7901
577801: glDebugOutputCallback: Medium severity API performance issue 9, Didn't find previous compile in the shader cache for debug
577801: glDebugOutputCallback: Medium severity API performance issue 1, Recompiling fragment shader for program 7901
577801: glDebugOutputCallback: Medium severity API performance issue 10, Didn't find previous compile in the shader cache for debug
577801: glDebugOutputCallback: Medium severity API performance issue 3, FS compile took 2.266 ms and stalled the GPU

Warnings in sequence in URL below:
https://gist.github.com/vrodic/6235313

It's interesting that most of this warnings were added by request of Valve back in 2012:
http://lists.freedesktop.org/archives/mesa-dev/2012-August/025288.html




Conclusion:

If you have a memory constrained machine and want to run under Linux, maybe using Wine is a better choice.

I hope Valve cares enough about Linux to fix what they can on their side and work with folks from Intel to fix their performance problems.

You will probably be luckier if you run it on nVidia GPU, since people in general are claiming performance very similar to Windows. Though probably still with slower startup times and higher mem usage.

Some info on how to compile Mesa 32 bit on 64bit Ubuntu:

Unfortunately this doesn't work with old Ubuntu versions, just with 13.10. For old version you must also remove 64 bit versions of the compiler, which was a bit too much of a requirement for me.

apt-get install gcc-multilib g++-multilib binutils-multiarch libx11-dev:i386 libdrm-dev:i386 

and as configure complains install others

apt-get build-dep mesa -ai386 

wants to install too much and remove some 64 bit stuff I need, and we don't actually need llvm i386 dev stuff for  compiling just the Intel driver.

I use this script to compile just the i965 driver for mesa:

sh autogen.sh
make clean
export CFLAGS="-m32 -O3 -mtune=native -march=native -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security"
export CXXFLAGS="-m32 -O3 -mtune=native -march=native -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security"
export PKG_CONFIG_PATH=/usr/lib/i386-linux-gnu/pkgconfig
./configure --disable-egl --enable-glx-tls --with-gallium-drivers= --with-dri-drivers=i965 --enable-32-bit
make -j4