There are a number of Magento tasks that, despite four-plus years with the platform, I’m still bad at. One is remembering to test my code for compatibility with the flat category and flat product collection modes.
After shipping code to a client that had a flat category bug (and then quickly fixing it) I had two clear choices. The first was to get serious about methodically testing any code that leaves my computer. The second was to come up with some crazy development methodology that would allow me to continue being lazy.
This article explores both the implementation details of the flat catalog features and my new solution for ensuring my code works in both flat and non-flat modes.
Trusting the Abstractions
One technique that allows a Magento developer to succeed is a day-to-day blind trust in the abstractions. When you’re working with a lighter-weight framework, you don’t necessarily need this blind trust. When you say
you’re aware that the simple ActiveRecord-ish ORM is making a query that looks something like this
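To make that concrete, here is a minimal sketch of what such a lightweight ORM does under the hood. The `Customer` class, its table name, and its key column are hypothetical and not taken from any particular framework.

```php
<?php
// Hypothetical ActiveRecord-style model, for illustration only.
class Customer
{
    protected $table      = 'customers';
    protected $primaryKey = 'customer_id';

    // Build the SQL a simple ORM would run when you call load($id).
    public function buildLoadQuery($id)
    {
        return sprintf(
            'SELECT * FROM %s WHERE %s = %d',
            $this->table,
            $this->primaryKey,
            (int) $id
        );
    }
}

$customer = new Customer();
echo $customer->buildLoadQuery(27), PHP_EOL;
// prints: SELECT * FROM customers WHERE customer_id = 27
```

With one table and one query per object, it is easy to keep the SQL in your head while you work.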
However, trying to stay aware of these underlying details in Magento, especially when you’re starting out, is impossible. Instead, trust that Magento’s objects will load the information you need, and only worry about the specifics if things stop behaving as you expect. Over time you’ll start to get an understanding of how Magento’s various sub-systems work, and which parts use ActiveRecord, EAV, or custom SQL.
This approach has served me well, but it’s also part of the reason I keep letting flat catalog bugs into my code.
The Flat Collections
Most models in Magento have a corresponding collection object, and this collection object can be instantiated by calling the model’s getCollection method.
Collection objects are used to fetch multiple instances of a particular object. If we could talk to Magento in plain English, that would sound something like
Hey Magento, give me all the CMS page objects whose titles contain the word “science”
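In code, that request could be sketched as follows. This assumes a bootstrapped Magento 1 environment (the `Mage` class already loaded); the `%science%` pattern is just the example from the sentence above.

```php
<?php
// Assumes a bootstrapped Magento 1 environment.
$pages = Mage::getModel('cms/page')
    ->getCollection()
    ->addFieldToFilter('title', array('like' => '%science%'));

// Iterating the collection triggers the load.
foreach ($pages as $page) {
    echo $page->getTitle(), PHP_EOL;
}
```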
The Magento product and category objects have collections
However, both of these collection objects present a performance problem. The Product objects are EAV objects, and fully loading each one requires multiple SQL queries. The Category objects are also EAV objects, and on top of that the category collection is implemented as a SQL-heavy, tree-based data structure.
To solve these performance problems, Magento introduced the idea of “flat” data. Reimplementing these models and collections in non-EAV/tree terms would have been a huge engineering effort. Magento’s entire “highly configurable” product model relies on EAV attributes and collections, and the category editing UI would be difficult to recreate without the nested tree features. Even if Magento had taken all this on, a lot of third party code would have stopped working.
Instead, Magento implemented indexers which will periodically query the standard collections and populate flat database tables
These tables have non-normalized product and category data that’s intended to be read only. This allows Magento to fetch category and product data in a single query.
Of course, we need to tell Magento it should do this instead of loading data from the primary EAV tables. Turning on flat catalog mode gives us a performance boost, but at the cost of some functionality, so Magento put that decision in the hands of its users. There are two configuration flags at
System -> Configuration -> Catalog -> Frontend -> Use Flat Catalog Category
System -> Configuration -> Catalog -> Frontend -> Use Flat Catalog Product
Magento will reference these flags (via helper methods) when it instantiates category or product objects. If set to Yes, Magento will instantiate flat resource models, and these new resource model classes will read their data from the flat tables
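If you want to see what the two flags are currently set to, a quick check like the one below should work. It assumes a bootstrapped Magento 1 environment and reads the raw configuration nodes behind the two admin settings. The core helpers may apply extra checks on top of the flag (for example, around the admin store), so treat this as a reading of the setting only, not of what the core will actually instantiate.

```php
<?php
// Assumes a bootstrapped Magento 1 environment.
// These paths are the configuration nodes behind the two admin settings.
$flatCategories = Mage::getStoreConfigFlag('catalog/frontend/flat_catalog_category');
$flatProducts   = Mage::getStoreConfigFlag('catalog/frontend/flat_catalog_product');

echo 'Flat categories: ', ($flatCategories ? 'on' : 'off'), PHP_EOL;
echo 'Flat products: ', ($flatProducts ? 'on' : 'off'), PHP_EOL;
```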
In grand Magento tradition, this has been implemented slightly differently by the teams/individuals responsible for each feature. For categories, this is implemented during model construction
whereas for products it’s implemented during construction of the collection itself
Ignoring internal Magento software architecture politics, what this means is when Magento is running in “flat” mode the category collection object is a
as opposed to
when running in “normal” mode.
Products are a little trickier. There’s no flat collection object — instead the main collection object creates different SQL depending on which mode Magento is running in (complicated by further different SQL depending on a cached loading vs. a normal loading). However, if you call the product collection’s
you’ll see different classes depending on which mode Magento is running in
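One way to inspect this yourself is to print the class names in each mode. The sketch below assumes a bootstrapped Magento 1 environment. Using `getEntity` to reach the product collection's resource model is my assumption about the method in question, and the exact class names you see will vary by Magento version.

```php
<?php
// Assumes a bootstrapped Magento 1 environment.
$categories = Mage::getModel('catalog/category')->getCollection();
$products   = Mage::getModel('catalog/product')->getCollection();

// The category collection class itself changes with the mode.
echo get_class($categories), PHP_EOL;

// The product collection class stays the same; the entity (resource
// model) behind it is what changes.
echo get_class($products), PHP_EOL;
echo get_class($products->getEntity()), PHP_EOL;
```

Run it once with each flag off and once with each flag on, then compare the output.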
As you can see, despite the two different approaches, the factory pattern is leveraged extensively in the flat catalog features.
OOP Gone Wrong
So why do we care? In theory, we shouldn’t need to. As client developers we can continue to use the Magento abstractions regardless of which mode Magento is running in and we’ll get back the information we’re after.
Unfortunately, while it’s clear the original Magento core team excelled at principles of object oriented programming, not all team members were on the same page. If you’re going to do the sort of polymorphism that lets you swap in the class
for the class
at runtime, the replacement class should be interface compatible with the original. As many Magento developers learn time and time again, this is not the case.
Here’s a concrete example of what I’m talking about. I was recently working on a feature that required getting the product count for each category. I did some digging and discovered if I loaded the category collection with the following code
the categories would load with a product count. Pleased with myself, I implemented the feature and moved on to my next task.
However, during testing the QA team received the following error
I had been developing with flat category mode off. Unfortunately, the setLoadProductCount method didn’t exist on the flat category collection, which triggered a fatal error when the QA team tested the site in flat category mode. The two collection objects weren’t interface compatible.
(For developers just entering the field, “QA” stands for quality assurance. In ancient times companies hired teams of people to test code and features before shipping them to clients. A few companies still practice this arcane art.)
While my philosophy of “trusting the abstractions” usually steers me right, in this specific case it steered me wrong, because I assumed the collection object would operate the same in flat vs. non-flat mode.
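One defensive option for this kind of mismatch is to check for the method before calling it and fall back when it is missing. The sketch below is not necessarily the fix I shipped; it uses two hypothetical stand-in classes so it runs without Magento, and the `requestProductCounts` helper is an illustrative name.

```php
<?php
// Hypothetical stand-ins for the two category collection classes.
class NormalCategoryCollection
{
    public $loadProductCount = false;

    public function setLoadProductCount($flag)
    {
        $this->loadProductCount = (bool) $flag;
        return $this;
    }
}

class FlatCategoryCollection
{
    // No setLoadProductCount() here, mirroring the incompatibility.
}

// Ask for product counts only when the collection supports it.
// Returns true if the request was made, false if the caller needs
// a fallback (for example, counting products another way).
function requestProductCounts($collection)
{
    if (!method_exists($collection, 'setLoadProductCount')) {
        return false;
    }
    $collection->setLoadProductCount(true);
    return true;
}

var_dump(requestProductCounts(new NormalCategoryCollection())); // bool(true)
var_dump(requestProductCounts(new FlatCategoryCollection()));   // bool(false)
```

A guard like this avoids the fatal error, but it only helps once you already know the two classes differ.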
Solving the Problem — with Chaos
All this brings us full circle. While I quickly fixed the specific problem in my code, there’s still the question of how to fix my bug of always forgetting about these collections.
I could double my billable time by developing in non-flat mode, testing in flat mode, fixing problems, re-testing in non-flat mode, and continuing the cycle until everything works. As much as that sounds like a good idea on paper, in practice it means I’d be burning a lot of cycles checking things that usually aren’t a problem, just to catch the rare instance when they are.
There’s always a full-scale unit/integration/acceptance testing suite, but again the issue of billable time rears its ugly but unavoidable head.
Instead, what I’ve landed on is automating that switching with a little bit of Netflix inspired chaos.
Pulse Storm Chaos
If you’re not familiar with Chaos Monkey, the TL;DR version is that Netflix deliberately introduces code into their production system that replicates “bad things” happening to their server farm: servers dying, slow responses, etc. This forces their engineering teams to write fault-tolerant systems.
While I’d never run this code in production, I decided a similar approach might be useful for dealing with Magento’s plethora of configuration options. The end result was a simple module called Pulse Storm Chaos. (GitHub, Magento Connect Package)
The Chaos module allows you to specify random values for Magento configuration variables in a fields.php configuration file. These values are swapped in at runtime, during the pre-dispatch action controller stage; the actual persistent database storage is never touched.
If that doesn’t make sense, take a look at
This include file returns an array of key/value pairs. The key is the configuration node, and the value is a valid PHP callback or a method on the
pulsestorm_chaos/values model. The above configuration will, on a per request basis, randomly switch your store into flat or non-flat mode.
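A fields.php with that effect might look like the sketch below. The two keys are the configuration nodes for the flat catalog flags. Writing the values as zero-argument closures is an assumption on my part for illustration; the file that ships with the module may instead name methods on the values model.

```php
<?php
// Hypothetical fields.php: each key is a configuration node, each value
// a callback that returns the value to use for the current request.
return array(
    'catalog/frontend/flat_catalog_category' => function () {
        return mt_rand(0, 1); // 1 = flat mode on, 0 = off
    },
    'catalog/frontend/flat_catalog_product' => function () {
        return mt_rand(0, 1);
    },
);
```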
This may seem like a crazy way to work, but it will quickly surface any assumptions you’re making about flat vs. non-flat collections. Another possible use might be randomly swapping between themes to make sure theme specific problems are surfaced sooner rather than later.
The above code uses a PHP 5.3+ anonymous function as a callback to set the
design/theme/default value. With the above code in place, your store would swap between themes during development.
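As a rough sketch, a theme-swapping entry could look like this. The theme names are placeholders; substitute the themes installed in your own design package.

```php
<?php
// Hypothetical fields.php entry; the theme names are placeholders.
return array(
    'design/theme/default' => function () {
        $themes = array('default', 'modern');
        return $themes[array_rand($themes)]; // pick one theme per request
    },
);
```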
Pulse Storm Chaos won’t be for everyone, and may cause more confusion than it helps solve. However, if you’re willing to accept a little chaos in your life, this module can help you spot problems before they get to the QA team or, worse, before they get to your customers.