Dec 24

Everyone who contributed to this blog in 2010 (Fabrice, Ryan, Aymeric) joins me (Philippe) in wishing you an excellent holiday season!

Modern Glass Christmas Tree

written by Philippe Humeau

Dec 21

Magento: a temperamental cache?


Here is a short article written after research carried out jointly by NBS System and the Magento Academy.

Our production director (Emile to his friends) found some strange behavior while configuring Magento systems > 1.4.0. With the arrival of the slow backend, results had indeed become much more « slow ».

Fabrice offered to clear up some of these mysteries by inspecting the code. Here are the first findings:

Magento 1.4: a change in the cache system


Since Magento 1.4, a new cache management system has been in place, called the two-level cache.

The first level, called the fast backend, is a fast cache and the main one. The second level, the slow backend, is a fallback cache used when the first-level cache is unavailable or saturated.

Using this slow backend is mandatory as soon as a « fast » first-level cache is configured.

In other words, even if you configure nothing as a slow backend, simply using APC, Memcached, Xcache or Eaccelerator enables the second-level cache, with a file cache as its type.

Configuring your cache correctly

Magento expects the following nodes as configuration values in the local.xml file:

<backend /> for the fast backend

<slow_backend /> for the slow backend

Magento normally expects one of the following values (mind the case) for the <backend> node:

  • file
  • sqlite
  • memcached
  • apc
  • xcache
  • eaccelerator
  • database

However, for the <slow_backend> node Magento expects the following values, or it will crash (mind the case):

  • File
  • Sqlite
  • Memcached
  • Apc, Xcache
  • Varien_Cache_Backend_Eaccelerator
  • Varien_Cache_Backend_Database
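To illustrate the two nodes and their differing case conventions, a local.xml fragment might look like the sketch below. The placement under <global><cache> and the app/etc path are assumptions on our side, not something established by the code inspection described here; the apc / File pair is just one combination taken from the two lists above.

```xml
<!-- app/etc/local.xml (fragment); node placement is an assumption -->
<config>
    <global>
        <cache>
            <!-- fast backend: lowercase value from the first list -->
            <backend>apc</backend>
            <!-- slow backend: capitalized value from the second list -->
            <slow_backend>File</slow_backend>
        </cache>
    </global>
</config>
```

Leaving <slow_backend> out would, per the behavior described earlier, still activate a file-based second level.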

A bug seems to be present in the Admin when the two-level cache system is active: in System > Cache Management, all caches appear disabled even though they are actually enabled. It looks like a bug that can be fixed easily (even though we advise against patching the Core unless you track your changes very precisely):

In the file app/code/core/Mage/Core/Model/Cache.php

Method _initOptions()
The line
if ($options === false) {
should be
if ($options === false || $options === null) {

However, this bug only seems to show up when the APC cache is full.

This APC cache, by the way, fills up continuously and is not « refreshed » quickly enough, so you soon fall back to the slow backend which, if not configured, lands on the hard disks (or on a RAM disk if you have set one up). You can still clean it by flushing it, but that is not optimal.

Puzzling trade-offs

Magento checks whether the cache has enough free space before caching anything. For APC, memory consumption is what gets checked: if the usage rate exceeds 80%, the APC cache is ignored. For a start, 80% of 64 MB and 80% of 1 GB are not the same thing, so why not set the threshold in megabytes rather than as a percentage?
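To make the percentage-versus-megabytes remark concrete, here is a small arithmetic sketch. The helper name apc_headroom_mb is ours (hypothetical), and the only figure taken from the article is the 80% threshold.

```python
def apc_headroom_mb(cache_size_mb, threshold_percent=80):
    """Memory (in MB) that stays unused once the usage threshold is reached."""
    return cache_size_mb * (100 - threshold_percent) / 100


# Compare a 64 MB cache with a 1 GB (1024 MB) cache.
for size_mb in (64, 1024):
    cutoff_mb = size_mb * 80 / 100
    print(f"{size_mb} MB cache: ignored above {cutoff_mb} MB, "
          f"{apc_headroom_mb(size_mb)} MB never used")
```

The same 80% rule leaves roughly a dozen megabytes idle on a small cache and a couple of hundred on a large one, which is the asymmetry questioned above.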

On write, Magento writes to APC only if its usage rate is below 80%; in all cases it writes to the slow backend.

On read, only the fast backend is used; if it does not exist or is unavailable, Magento reads from the slow_backend.

Note that the two-level cache has an auto_refresh_fast_cache option. It seems odd, because all it does is re-cache data that was loaded from the cache, thereby pushing back its expiry date; this only happens if the cache usage rate is < 80%.
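The write and read rules above can be summarized in a toy model. This is a sketch of the behavior as described in this article, not Magento or Zend code: the class name, the entry-count notion of "usage" and the way auto_refresh_fast_cache is modeled (copying an entry back into the fast store when it had to be read from the slow one) are all simplifying assumptions.

```python
class TwoLevelCacheModel:
    """Toy model of the two-level cache rules described in the article."""

    FILL_THRESHOLD = 0.80  # the 80% usage limit mentioned above

    def __init__(self, fast_capacity, auto_refresh_fast_cache=False):
        self.fast = {}  # stands in for APC, Memcached, ...
        self.slow = {}  # stands in for the file cache
        self.fast_capacity = fast_capacity  # hypothetical unit: number of entries
        self.auto_refresh_fast_cache = auto_refresh_fast_cache

    def fast_usage(self):
        """Fraction of the fast store currently in use."""
        return len(self.fast) / self.fast_capacity

    def save(self, key, value):
        # The fast backend is written only while usage is below the threshold...
        if self.fast_usage() < self.FILL_THRESHOLD:
            self.fast[key] = value
        # ...whereas the slow backend is written in every case.
        self.slow[key] = value

    def load(self, key):
        # Reads go to the fast backend first.
        if key in self.fast:
            return self.fast[key]
        # Fallback: read from the slow backend.
        value = self.slow.get(key)
        # Assumed modeling of auto_refresh_fast_cache: re-cache the loaded
        # entry in the fast store, still subject to the 80% rule.
        if (value is not None and self.auto_refresh_fast_cache
                and self.fast_usage() < self.FILL_THRESHOLD):
            self.fast[key] = value
        return value


cache = TwoLevelCacheModel(fast_capacity=5)
for i in range(5):
    cache.save(f"key{i}", i)
print(len(cache.fast), len(cache.slow))
```

With a capacity of five entries, the fifth save skips the fast store because usage has already reached 80%, while the slow store receives every entry.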

Conclusion on the Magento 1.4+ cache


Writing to the slow backend is therefore systematic, but it only happens when the fast backend has to write as well: when you open a page, both the slow and the fast backend entries are generated. If you later delete the slow_backend files but not the fast ones, the slow backend files are regenerated only when the fast backend's entries are too. As long as the fast backend holds a valid cache, the slow backend files are not regenerated.

So there are no major bugs in the configuration; it is just a bit subtle to understand and undocumented, which does not help. On the other hand, the way the Fast backend is used, combined with its too-low renewal rate, makes the system less efficient, since the Fast backend is almost always full of the oldest data.

Thanks to Emile for the lead and to Fabrice for the analysis.

written by Philippe Humeau

Dec 15

Introduction

After the success of last semester's MDA (Magento Darwin Awards), we decided to publish the 2010 update.

The concept, for newcomers, is to elect the best « worst idea », the most unpredictable « victory of mind over matter »: simply put, the best of the best failures we had to handle at our support desk. This is my community blog, but I'm also the CEO of a European hosting company called NBS System (our site will soon be available in English as well).

We currently handle managed servers in the e-commerce field, mainly for Magento. We are the biggest Magento hosting company in France with >400 hosted sites, so we get a lot, a very big lot, of support requests, and some are just « too big to be silently classified ».

Like everyone, we also make mistakes, so we have our own entries in this MDA! One more thing: we never laugh in bad spirit. We laugh because it is a way to lower the pressure; it's never meant to « taunt » people or make fun of them, it's just for everyone to enjoy this little best-of! As a matter of fact, in this issue, we ranked second, lucky us…

The concept is inspired by the world-renowned « Darwin Awards », which honor the people who were kind enough to clear the human gene pool of their flawed genome by dying in the most stupid possible way.

Enough introduction, let's get straight to the winners; here are the Winter 2010 “Magento Darwin Awards”!

Rank 1: Virtualization is Evil

Customer’s Tech Team: Boss, we should consider moving this App to a Virtual Machine.
Customer’s Boss: No. Virtualization consumes a lot of resources. We’ll use Cloud instead.
Customer’s Tech Team: But I think that Cloud is based on virt…
Customer’s Boss: I don’t care what you think, I said we’ll do Cloud!

/* Later */

Customer’s Tech Team: Okay, our boss had a brilliant idea; we won’t take the Virtual Machine but a Cloud instance.
Tech Team: ??? But the Cloud is…
Customer’s Tech Team: I know… But my boss doesn’t and, in a way, I think it’s better this way. I bet he saw a Microsoft ad yesterday…
Tech Team: So be it!

Rank 2: SEO Genius

As we did previously, I’d like to include our own subtle yet stupid mistakes. Everyone has failures and, even with the best procedures, goodwill and attention, well, you know the deal, shit happens…

SEO Provider: Hi Philippe. I’ve been doing a quick check around your site (www.nbs-system.com) and found some pretty unexpected results in your SEO.
Me: Like what?
SEO Provider: Have you made any special effort to be ranked on « expresso coffee machine » or « vacuum cleaner robot »…?
Me: Well, not really; besides, this is not really related to hosting or security, so… no.
SEO Provider: Well, now you are ranked 3rd on this! Cheers. With a subdomain being [test-customer].nbs-test.nbs-system.com
Me: No… Don’t tell me these subdomains are publicly available to the bots?
SEO Provider: Okay, I won’t tell you.
Me: Damn, I think we should also be positioned on high heels, sex toys and running shoes since those are the customers actually using the test domain.
SEO Provider: Yep. But according to my statistics, you really are losing positions on sex toys…
Me: OMG…!

Rank 3: Wake me up!

Customer: Your 24×7 support service wasn’t reachable this morning at 6:50 am. I’m paying you every month for a « so-called » 24×7 support and your guys aren’t reachable, shame on you.
Me: Well, let me dig into this problem; I’ll find the one responsible for this very serious mistake and get back to you.
Customer: You’d better…

/* Later on */

Me: Guys, give me your support cellphone logs. I want to know if someone missed a call this morning at 6:50 am, big issue there.
Support team: Okay boss, here is mine, have a look at the log, I have never ever missed a call.
Me: I look in the log book and… There’s no incoming call at 6:50… I take a picture of the cellphone screen and send it back to the customer with some nice words around it.

/* Later on */

Customer’s tech team member: I double-checked my phone log and last Monday at 6:50 am, I dialed 06.xx.77.xx.63 instead of your number 06.xx.77.xx.93. So yes, I didn’t get someone from your service but… It was barely detectable on my end since the guy told me « Yes? », I explained the problem about our servers being overloaded for a minute and a half and in the end he said « Yes, no problem, I’ll take care of that »!!! Sorry about that…

What happens in the JVM stays in the JVM

Customer: I had awful sales this weekend.
Support team: Well, we can’t really help with that, can we?
Customer: Oh yes you can, there are only 3172 products available instead of 1.4 million SKUs.
Support team: This could explain why the sales were “limited”, let me have a look…
Support team: After discussing a bit with your web agency, we found that the SOLR servlet died while indexing your catalog, after just 3172 products were indexed.
Customer: And you didn’t detect anything?
Support team: Well, the JVM daemon was still there, and as we like to say “what happens in the JVM stays in the JVM”, so no, we weren’t able to see the problem.
Customer: Any way we can spot this in the future?
Support team: Your web agency provided test code to run from cron to check whether the servlets are healthy or not.
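The agency's actual test code is not shown here. As a rough sketch of what such a cron check could look like, the snippet below reads the numFound field from the JSON body of a Solr select query and compares it with the expected catalog size. The function names, the 95% tolerance and the sample body are hypothetical; fetching the body from the real Solr URL is left out on purpose.

```python
import json


def indexed_count(solr_response_body):
    """Extract numFound from the JSON body of a Solr select query."""
    return json.loads(solr_response_body)["response"]["numFound"]


def index_looks_healthy(indexed, expected, min_ratio=0.95):
    """True when the index holds at least min_ratio of the expected catalog."""
    return expected > 0 and indexed >= expected * min_ratio


if __name__ == "__main__":
    # Sample body standing in for a real query result (hypothetical data).
    sample_body = '{"response": {"numFound": 3172, "start": 0, "docs": []}}'
    count = indexed_count(sample_body)
    if not index_looks_healthy(count, expected=1_400_000):
        print(f"ALERT: only {count} products indexed")
```

A check of this kind looks at the index content rather than at the JVM process, which is what was missing in the story above.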

Phpinfo()

Customer: Is there a way on your systems to get information like what phpinfo() could provide?
Support team: Well, you can execute PHP on the servers, so why not use phpinfo()?
Customer: Oh, yes, brilliant… welllllll, sorry…

Wait, I’m looking for a solution…

Support team rookie: We haven’t found the problem yet but we are still looking for it and we’ll get back to you as soon as there is a solution to the problem you reported.
Support team Boss: Just tell me you didn’t send that mail to the customer…

Look, daddy is on TV!

Customer (on a VPS server): I’m proud to announce I’ll be on a national TV show!
Support team: Well, nice success, are you planning to talk about your site?
Customer: Of course I will, this will attract a lot of customers to my site!
Support team: And what are you planning to do to handle the hundreds of thousands of hits that your site will take?
Customer: Uh, why?
Support team: Because you will get more than 50 times your normal traffic, so your server won’t take it.
Customer: No problem, do what’s needed, I’ll pay.
Support team: When is the show exactly?
Customer: In one hour and a half…
Support team: Praying will give you the best results!

My flash is beautiful

Support team: 4 films of 10 MB each streaming in Flash on the homepage is a lot.
Customer: Well, this is not much actually, the advertising company did its best to lower their size.
Support team: Well, compared to the average 1.8 MB surf session, this is still quite a significant amount of data.
Customer: Could this explain our bandwidth consumption?
Support team: This is a serious lead :)

We are hosting NASA (returns)

Support team: Is this site really taking 300 000 unique visitors per day?
Customer: No.
Support team: Well, it seems the reverse proxy is counting this precise amount and your 4th web server just died under a load of 280… (with 12 cores, a load of 12 means 100% utilization with no process waiting to be handled). The second one is responding every odd minute.
Customer: Okay, I think someone made a small mistake in a retargeting campaign, we take hits for anything. People are looking for chainsaws and get our retargeting ads about underwear.
Support team: Perhaps the retargeting algorithm thought that people interested in chainsaws usually buy underwear when leaving a site?
Customer: Sure dude, let me just hang the retargeting company, consider this a preemptive move.

700 000 mails sent for only 80 needed?

Support team: One of the mailers is dying!
Support team Boss: Why is it overloaded?
Support team: Someone is trying to send 700 000 mails…
Support team Boss: Shoot the queue, drop the incoming connections from this customer on the TCP/25, call Yoda and tell him to send reinforcements.
Support team: Ok Boss!

/* Later */

Support team (to customer): Why the hell did your site try to send 700 000 mails in one hour?
Customer: I created a loop in PHP to send 80 mails. It was supposed to parse a file and send a mail for each line. I guess it failed…
Support team: In a way, yes… At least, did the 80 people get their mails?
Customer: No.
Support team: Failed. You’ll probably be in the MDA 2010 with this one.
Customer: Fair enough…

We commit to production this weekend

Customer: We will commit our preproduction to production this weekend.
Support team Boss: No way. It will fail, you don’t have the 24/7 support option and your developers aren’t there, this is plain suicide.
Customer: No no, not a problem, it’s just dropping my existing catalog and importing new products, I can handle that alone and besides, the dev team left me a script.
Support team Boss: Has this script been tested?
Customer: I guess yes.
Support team Boss: No way.

/* Later */

Customer: The process failed, it’s Saturday and I can’t do anything, can you help?
Support team: Did someone tell you that weekend migrations generally fail?
Customer: Well, yes…
Support team: This person was wrong, the correct sentence was « It always failllls on the weekend ».
Customer: So what do we do?
Support team: Well, we’ll try to fix that for you, but you’ll be billed.

(PS: My wife hates you, but this is only because she doesn’t like being woken up at 4 am, nothing personal)

AND… Medal of honor: The Slider O’ Death returns!

Live, in production, even more beautiful than the « project » which won the previous award.

(Excited) Customer: Hi NBS System team, I’d like to inform you that we’ll be on TV tomorrow!
Support team: Well, glad you made it to people’s screens! We’ll watch your servers closely this evening and provide additional boosters if needed.

/* No… come on, as usual, this was not the way things happened. What really happened: */

(Annoyed) Customer: Hi, my servers are very very very slow.
Support team: I’ll have a look ASAP.

/* support guy working */

Support team: Well, I found a hint. Part of the problem is that you have a lot of incoming connections; another part is that your site is consuming way more resources than before, especially the part called « Design your own xxxx ». Do those inputs help you? Can you provide me with more information regarding this?

(Guilty voice) Customer:
Well, perhaps we should have informed you that we were on a major TV show tonight?

Support team:
Noooooo… it’s not as if those nasty little TV appearances quadruple your incoming traffic, no, why bother with such details? Still, this only explains the overload; we will start a dozen virtual machines and your site will be taking the load in 5 to 7 minutes. But why is it consuming 10 times more resources, especially in database requests, which have peaked at … WTF? … 37 000 SQL requests per second???

(still feeling guilty voice) Customer:
I don’t really know what you’re talking about, I’m not IT.
Support team: Did you put any new stuff online, like a new feature?
(Proud) Customer: Yes a very nice feature where you can design your own [customer_product].
Support team: Is it something where you can choose among ten types, 8 different sizes, 6 shapes, 10 colors and so on?
(Very proud) Customer: Yes, and it shows you your product live among our full catalog of hundreds of thousands of articles.
Support team: OMG, The Slider O’ Death finally got its way online!!!

For those who don’t know what the SOD concept is, it was the winner of the last Magento Darwin Awards contest. You can find the full review of this wonderful concept here. Simply put, this is a way to query the DB live, using AJAX requests, as soon as you move a cursor on a slider. 10 positions, 10 colors, 6 shapes, etc… 10*10*6*8*… = thousands of requests per second: move your cursor faster and it makes thousands of requests per second.

You know, those little sliders, looking so cute (and loading the products live):

slider

The NBS System Support team wishes you a very good Xmas and wonderful end-of-year sales. We’d like to thank every support cell around the world, especially the hosters’ and Magento’s. Cheers guys!

Emile aka « Imil », aka The_Boss

Photo Emile

Adrien aka « Ze »

Head with interrogation point. The intensity of the doubt.

Denis aka « Jawa »

Christophe aka « Chris »

Guillaume aka « Champitoad »

Florent aka « Flo »

Head with interrogation point. The intensity of the doubt.

written by Philippe Humeau