While profiling a product indexing process, we found that a large share of the time needed to crawl products was spent in calls to getCustomAttribute(). This method is typically used to access additional product attributes that were created in the Magento admin area. We managed to cut runtime by 40% by replacing getCustomAttribute() with getData() in our Solr module.

Custom Attributes

As described in the official Magento Developer Documentation, custom attributes are non-standard attributes that can be created in the Magento 2 admin area or in code by a developer. The official way to access these attributes and their values in PHP is through the methods getCustomAttribute() and getCustomAttributes(). For products, these are so-called API methods, which means that their use is officially recommended and that they will remain available in future versions of Magento.

Our Findings

Astonished by the finding that the calls to getCustomAttribute() cost us about 40% of performance during indexing, we decided to create a small benchmark. We installed Magento Open Source 2.2.2 with the official sample data and created a CLI command. This command traverses all products in the database in a foreach loop and measures the elapsed time, once with and once without a call to $product->getCustomAttributes() in the loop.

These are the results on my development machine:

Without a call to $product->getCustomAttributes(): 2046 products traversed in 1.15s.
With a call to $product->getCustomAttributes(): 2046 products traversed in 34.31s.

If you want to take a look at the exact code, you can find it as a Gist on GitHub. The complete module is also available for download. Extract the code into app/code/IntegerNet/CustomAttributeTest/, run bin/magento setup:upgrade and then bin/magento productcollection:test.
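
For readers who only want the gist of the measurement, the timing loop can be sketched roughly as below. This is not the code from the Gist: the function name traverseProducts() is hypothetical, and the product list is assumed to be any iterable of product objects, for example a freshly created product collection per run.

```php
<?php
declare(strict_types=1);

/**
 * Traverse all products and measure the elapsed wall-clock time.
 * Hypothetical helper, not taken from the original module.
 *
 * @param iterable $products             e.g. a product collection (assumption)
 * @param bool     $callCustomAttributes whether to call getCustomAttributes() per product
 * @return array   [number of products traversed, elapsed seconds]
 */
function traverseProducts($products, bool $callCustomAttributes): array
{
    $count = 0;
    $start = microtime(true);

    foreach ($products as $product) {
        if ($callCustomAttributes) {
            // The call under test; the return value is intentionally discarded.
            $product->getCustomAttributes();
        }
        $count++;
    }

    return [$count, microtime(true) - $start];
}
```

In a CLI command, the function would be called twice, each time with a new collection so that nothing loaded in the first run is reused in the second, and the two elapsed times would be printed next to the product count.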

As the numbers show, the call to getCustomAttributes() is very expensive. It doesn’t matter whether we access one attribute or many: the first call to getCustomAttribute() on a product loads all custom attributes, regardless of whether the attribute values have already been loaded with the collection.

Workaround

My workaround is to replace the method call with $product->getData(). This method also returns custom attribute values, along with the default attributes, and doesn’t trigger an extra database lookup.
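
A minimal sketch of the replacement, wrapped in two small helper functions so both variants can be compared side by side. The function names and the attribute code my_attribute are hypothetical. $product is assumed to be a \Magento\Catalog\Model\Product instance, for example an item of a product collection.

```php
<?php
declare(strict_types=1);

/**
 * Before: read the value through the API method.
 *
 * @param \Magento\Catalog\Model\Product $product
 * @param string $attributeCode
 * @return mixed|null
 */
function getValueViaCustomAttribute($product, string $attributeCode)
{
    $attribute = $product->getCustomAttribute($attributeCode);

    // getCustomAttribute() returns null if the attribute cannot be loaded,
    // so this check is required before calling getValue().
    return $attribute !== null ? $attribute->getValue() : null;
}

/**
 * After: read the value directly from the product's data.
 *
 * @param \Magento\Catalog\Model\Product $product
 * @param string $attributeCode
 * @return mixed|null
 */
function getValueViaGetData($product, string $attributeCode)
{
    return $product->getData($attributeCode);
}
```

Inside a product loop, the second variant would be used as $value = getValueViaGetData($product, 'my_attribute');, or simply inlined as $product->getData('my_attribute').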

(With getCustomAttribute(), an extra null check is needed before reading the value, because the method returns null if the custom attribute cannot be loaded. Not a good design decision in my eyes.)
The drawback is that $product->getData() isn’t covered by any API annotation, so it doesn’t carry the same compatibility promise as the API methods.

Side Notes

Some further findings:

  • It doesn’t matter if Flat Tables are activated or not; performance stays roughly the same.
  • The relative performance gain is the same in developer mode and production mode. In production mode, both loops take about half the time.
  • We haven’t tested performance for other entities like customers or categories yet.

Research and a future fix

Investigating the issue more deeply, we found that the performance problem is caused by how custom attributes are handled internally. For every product, an attribute collection containing all attributes is loaded, which causes an additional database query. By moving that functionality to the corresponding Resource Model, my colleague Fabian managed to cache this attribute collection so that it is loaded only once per request. He submitted his optimization as a pull request, so the performance of product loops with getCustomAttribute() calls will hopefully improve a lot in future releases of Magento 2.
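
The idea of loading the attribute list once per request instead of once per product can be illustrated with a small memoization sketch. This is not the code from the pull request. The class name, the loader callable and the entity type key are hypothetical and only demonstrate the caching pattern.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration: caches the list of custom attribute codes
 * per entity type for the lifetime of the object (one request).
 */
class CachedAttributeCodeProvider
{
    /** @var callable Performs the expensive lookup, e.g. a database query */
    private $loader;

    /** @var array Loaded attribute codes, keyed by entity type */
    private $cache = [];

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function getAttributeCodes(string $entityType): array
    {
        if (!isset($this->cache[$entityType])) {
            // Expensive step: executed only on the first call per entity type.
            $this->cache[$entityType] = ($this->loader)($entityType);
        }

        return $this->cache[$entityType];
    }
}
```

With such a provider shared between all products in a loop, the expensive lookup runs once no matter how many products are traversed.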

Author: Andreas von Studnitz

Andreas von Studnitz is a Magento developer and one of the Managing Directors at integer_net. His main areas of interest are backend development, Magento consulting and giving developer trainings. He is a Magento 2 Certified Professional Developer Plus and holds several other Magento certifications for both Magento 1 and Magento 2. Andreas was selected as a Magento Master in 2019 and 2020.
