Product Import with Magento

A comparison of Import Methods

There are various methods to import products into Magento, and this article compares them. It is based on the talk "Importing Products into Magento" given at Meet Magento NL 2014. I compared the methods that are already integrated in Magento with the most popular external modules.

The methods are compared in terms of handling, functionality and performance. Not every method supports all product types. In many cases it is also an advantage if the imported products are indexed automatically, so that you don't have to rebuild all indexes for every product in the shop after each partial import.

Basically, there are various approaches for imports:

  • Externally controlled: the product data, which typically comes from an external system, is prepared there or in a middleware in the format Magento expects. You then import the products using one of the methods listed below, either through a CSV file or via a web service.
  • Magento controlled: the import is controlled by a Magento module (or script). The module is responsible for collecting the data from the source, converting it into the appropriate format (often a PHP array) and passing it to the corresponding import method. This can be a project-specific module or, for example, Ho_Import, developed by Paul Hachmang.

The choice of approach and import module mainly depends on the specifics of the project and on the preferences of the responsible developer. An experienced developer or consultant can assist with both the decision and the implementation, so that an unsuitable approach is ruled out from the outset.

Import methods

Mage::getModel

Within the Magento framework, products and other types of data can be handled as models, obtained with Mage::getModel(…). They can be loaded, modified, saved and deleted. These internal methods offer the full flexibility of all Magento functionality, not just for product imports. They can be used in a separate module or script and provide a reliable import. The major disadvantage of this method is that it is relatively slow, especially for product data, which is quite complex in Magento.

Soap / XmlRpc

These are the API methods integrated in Magento, with which you can, for example, create products from another server. They are not limited to imports: they also cover creating and reading categories, customers, orders, invoices, etc., which includes the most important Magento functionality, but not all of it. All available functions are listed at http://www.magentocommerce.com/api/soap/introduction.html. The main disadvantage is the limited performance.
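
To make the web service approach more concrete, here is a minimal Python sketch of creating one simple product through the XML-RPC endpoint. It assumes the usual Magento 1 API conventions (`login`, `call` with the resource path `catalog_product.create`, `endSession`). The endpoint path, the attribute set ID 4 and the helper names `create_simple_product` and `connect` are assumptions for this example and are not taken from the tests in this article.

```python
import xmlrpc.client


def create_simple_product(server, api_user, api_key, sku, product_data,
                          attribute_set_id=4):
    """Create one simple product via the Magento 1 XML-RPC API.

    `server` is an xmlrpc.client.ServerProxy (or any object offering the
    same login/call/endSession methods). attribute_set_id=4 is an
    assumption: it is usually the ID of the "Default" attribute set.
    """
    # Every API session starts with a login that returns a session ID.
    session = server.login(api_user, api_key)
    try:
        # catalog_product.create expects: product type, attribute set ID,
        # SKU and a struct with the attribute values.
        return server.call(
            session,
            "catalog_product.create",
            ["simple", attribute_set_id, sku, product_data],
        )
    finally:
        # Close the session even if the call raised a fault.
        server.endSession(session)


def connect(base_url):
    """Return a proxy for the XML-RPC endpoint below the shop's base URL."""
    return xmlrpc.client.ServerProxy(base_url.rstrip("/") + "/api/xmlrpc/")
```

Against a real shop you would pass `connect("http://magento.example.com")` (a placeholder host) together with the credentials of an API user. A real import would reuse one session for many products, but each product still costs at least one HTTP request, which fits the low throughput measured for this method.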

REST

REST is the web service interface introduced with Magento version 1.7 (Community Edition). Access is authorized via OAuth. Product and customer data can be created and edited; all other data is read-only. Details can be found at http://www.magentocommerce.com/api/rest/introduction.html. The main problems are the limited functionality and the limited performance.

Dataflow

Dataflow is the “old” way to import and export products and customer data into Magento. It works with a specific format of CSV files. However, it is not simple to implement this method in your own workflow and it suffers from very poor performance. This method is considered to be outdated, but is still used occasionally.

ImportExport

ImportExport is a module which has been integrated into Magento since version 1.5. It has become the successor of Dataflow and can import and export customer and product data much faster. ImportExport is also based on reading a special format of CSV files. You can find documentation about that format at http://en.integer-net.de/2012/04/04/importing-products-with-the-import-export-interface/. Compared to Dataflow, ImportExport is considerable progress, but because it is tied to CSV files it is still difficult to include in your own automated workflow. Another disadvantage is that it doesn't support indexing, which means that the indexes always have to be rebuilt after importing.
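
As a rough illustration of preparing such a file in an external system, the following Python sketch maps source records to rows and writes them as CSV. The underscore-prefixed columns (`_type`, `_attribute_set`, `_product_websites`) follow the ImportExport convention. The source field names (`article_no`, `title`, `net_price`), the attribute set "Default", the website code "base" and the fixed attribute values are assumptions for this example and have to be adapted to the shop.

```python
import csv
import io

# Columns of the import file. The underscore-prefixed ones are control
# columns of ImportExport; the others are product attribute codes.
COLUMNS = [
    "sku", "_type", "_attribute_set", "_product_websites",
    "name", "description", "short_description", "price",
    "weight", "status", "visibility", "tax_class_id", "qty",
]


def to_import_row(record):
    """Map one source record (hypothetical field names) to an import row."""
    return {
        "sku": record["article_no"],
        "_type": "simple",
        "_attribute_set": "Default",      # assumption: default attribute set
        "_product_websites": "base",      # assumption: default website code
        "name": record["title"],
        "description": record.get("text", record["title"]),
        "short_description": record["title"],
        "price": f'{record["net_price"]:.2f}',
        "weight": record.get("weight", 0),
        "status": 1,                      # 1 = enabled
        "visibility": 4,                  # 4 = catalog and search
        "tax_class_id": 2,                # assumption: ID of the tax class
        "qty": record.get("stock", 0),
    }


def write_import_csv(records, fileobj):
    """Write all records as one CSV file with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        writer.writerow(to_import_row(record))


# Example run with a single made-up record, written to an in-memory buffer.
source = [{"article_no": "A-100", "title": "Test product", "net_price": 9.5}]
buffer = io.StringIO()
write_import_csv(source, buffer)
print(buffer.getvalue())
```

Keeping the mapping in one small function makes it easy to see which values come from the source system and which are fixed defaults for the shop.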

FastSimpleImport

Initiated by the author of this article, the module is based on ImportExport and is available as a free, open source module at https://github.com/avstudnitz/AvS_FastSimpleImport . It allows you to provide data in the form of a PHP array, which makes it possible to integrate it into your own workflow. Compared to ImportExport, some functions were added, such as importing categories and partial indexing. The drawback of this module is, as with ImportExport, the complicated, not particularly intuitive data format, although minor improvements over the standard module have been made.

ApiImport

ApiImport follows a similar concept to FastSimpleImport and is also based on ImportExport. It was developed by the Dutch developer Daniël Sloof and is available free of charge at https://github.com/danslo/ApiImport. It offers most of the features of FastSimpleImport, adds access via SOAP or XmlRpc, and accelerates imports via web services significantly.

CustomImportExport

The free Magento module by Antonio Martinez (available at http://www.magentocommerce.com/magento-connect/amartinez-customimportexport.html) follows the same approach as FastSimpleImport and ApiImport by using and extending the Magento default import "ImportExport". It has a sophisticated command line interface, but only accepts CSV files for import. Apparently, the module is no longer actively developed; the last release was in 2012.

Magmi

Magmi is a free import tool developed by the French developer Sébastien Bracquemont, available at http://sourceforge.net/projects/magmi/. It is a PHP tool that is independent of Magento, has its own user interface and accesses the Magento database directly for an import. Extensions are available through a plugin system. By default it imports CSV files in the Dataflow format, but it is faster and gives access to advanced features. Through the so-called Datapump functionality you can also import products via PHP scripts. Unfortunately the tool does not support all indexes on the fly, so a reindex must be initiated separately after each import.

uRapidFlow® and uRapidFlow® Pro

uRapidFlow is the only commercial module in the list. It was developed by Boris Gurvich, one of the masterminds behind Magento, who went independent with his company Unirgy in 2008. The simple version of the importer is available at http://www.unirgy.com/products/urapidflow/ for $120; the Pro version costs $670 and additionally lets you import categories and further product types. uRapidFlow is a very flexible Magento module, which enables various imports with the help of profiles. The main drawback of this module is that parts of the source code are encrypted with the ionCube PHP module.

Comparison of the features

The key features of each import method are shown in the following table:

In Magento integrated Partial Indexing Web Service Simple Products Config. Products Grouped Products Bundle Products Downloadable Products Categories Customers
Mage::getModel X X X X X X X X X
Soap / XmlRpc X X X 1 X 1 X X
REST X X X X
DataFlow X 2 2 X
ImportExport X X X X X
FastSimpleImport X X X X X X X
ApiImport X X X X X X X X
CustomImportExport X X X X
Magmi X3 X X X X
uRapidFlow X X
uRapidFlow Pro X X X X X X

1 Free module available

2 Commercial module available

3 Only the URL rewrite index and the category product index, via a free plugin

Comparison of the performance

We measured the performance of all methods with a small import framework written specifically for this purpose. It is freely available at https://bitbucket.org/integer_net/integernet_importtest and loads the necessary import modules with the Magento-Composer-Installer. It is controlled from the command line and outputs the measured import times:

Shell execution of the import test

  • Before each import, all products are deleted in order to start with a clean database.
  • We measured the various import methods with and without partial indexing, for up to 50,000 simple products with the required attributes.
  • For the URL index, we use the free open source module EcomDev_UrlRewrite, because it offers significantly better performance than the standard index and it supports both ApiImport and FastSimpleImport directly.

Results without indexing (in seconds)

Number of items 5 50 500 5,000 50,000
Mage::getModel 1.21 9.46 89.37 867.09
Soap / XmlRpc 37.16 317.37 3147.2
REST 8.11 83.99 862.89
DataFlow 2.4 12.84 124.43 1228.7
ImportExport 0.75 0.94 3.84 34.9 369.87
FastSimpleImport 0.66 1.02 4.46 37.36 370.55
ApiImport 1.2 0.69 5 39.72 655.83
ApiImport (via XmlRpc) 2.53 2.78 5.38 223.83
CustomImportExport 7.59 7.45 11.59 50.62 429.87
Magmi 0.26 2.06 24.3 194.96 1960.8
uRapidFlow¹ 0.15 0.36 2.66 24.77 240.46

¹ Estimate, as a direct test wasn’t possible

Results with partial indexing (in seconds)

Number of items 5 50 500 5,000 50,000
Mage::getModel 5.22 51.11 631.48 20523
Soap / XmlRpc
REST
DataFlow
ImportExport
FastSimpleImport 1.58 1.88 5.69 44.38 434.27
ApiImport 1.2 0.69 5 39.72 655.83
ApiImport (via XmlRpc) 2.53 2.78 5.38 223.83
CustomImportExport
Magmi¹ 0.24 2.36 23.88 239.55 3486.7
uRapidFlow² 2.22 10.55 100.57 1189.3 10830

¹ Not all indexes are supported

² Not tested, but supported

Diagram with the results

You can see very clearly that there are large differences in speed between the following groups:

  • Methods that are based on Mage_ImportExport (ImportExport itself, FastSimpleImport, ApiImport), and uRapidFlow, which is even faster
  • Other external methods (Magmi)
  • Other built-in methods in Magento: the import via SOAP in particular is extremely slow and needed more than five seconds per product in our tests, not counting the indexing that still has to follow.

The results are only valid for the specific server and configuration used. Nevertheless, they give an indication of the relative performance of the methods compared with each other.
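
To compare the methods independently of the batch size, the measured times can be converted into products per second. The short Python sketch below does this for the results without indexing, using the largest batch measured for each method. It assumes that rows with fewer values fill the columns from the left (5, 50, 500, ...); the row "ApiImport (via XmlRpc)" is left out because that assignment is less clear there. Since the batch sizes differ between methods, the figures are only a rough guide.

```python
# Largest batch measured per method without indexing: (products, seconds).
# Values are taken from the results table above; rows with fewer values are
# assumed to fill the columns from the left (5, 50, 500, ...).
MEASUREMENTS = {
    "Mage::getModel": (5000, 867.09),
    "Soap / XmlRpc": (500, 3147.2),
    "REST": (500, 862.89),
    "DataFlow": (5000, 1228.7),
    "ImportExport": (50000, 369.87),
    "FastSimpleImport": (50000, 370.55),
    "ApiImport": (50000, 655.83),
    "CustomImportExport": (50000, 429.87),
    "Magmi": (50000, 1960.8),
    "uRapidFlow": (50000, 240.46),
}


def products_per_second(products, seconds):
    """Throughput of one measurement."""
    return products / seconds


def ranking(measurements):
    """Return (method, products per second) pairs, fastest first."""
    rates = {
        name: products_per_second(count, seconds)
        for name, (count, seconds) in measurements.items()
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


for method, rate in ranking(MEASUREMENTS):
    print(f"{method:<20}{rate:10.2f} products/s")
```

Sorting by throughput reproduces the grouping described above: the methods based on Mage_ImportExport and uRapidFlow come first, Magmi follows, and the built-in web services come last.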

Conclusion and personal recommendation

Personally, I use FastSimpleImport in almost all cases because it is the best in terms of performance and additional functions (and of course also because I know it best). ApiImport and Magmi come next as comparable solutions that can be used without hesitation, although I personally don’t like Magmi’s approach of being an external tool with direct access to the database. You might also be interested in uRapidFlow, which showed the highest speed. You have to pay for it (but you also receive great support in return), and its main disadvantage is the encrypted source code.

When deciding which import method to use, the functionality and ease of use of the module should carry more weight than absolute performance, which depends on many factors and settings of the module. In some cases it may therefore be wise to use Mage::getModel for the product import, despite its low performance. For many other tasks, such as importing attributes, attribute sets, customer groups, price rules, etc., it is necessary anyway.

Should access via the Magento API be necessary, I would use ApiImport, because imports of products, categories and customers via SOAP or XML-RPC are considerably faster than with the Magento default. Imports are very project-specific and offer several possibilities for optimization, independent of the particular import module. Usually, the pure import time is only a small part of the total runtime if product images still need to be downloaded or data has to be enriched or reprocessed. Therefore, this article can only offer an overview of the import methods and doesn’t replace expert advice or personal experience.

I am looking forward to your comments, questions and suggestions.

Andreas von Studnitz

Author: Andreas von Studnitz

Andreas von Studnitz is a Magento developer and CEO of integer_net. His main areas of interest are interface development, backend development, Magento consulting and developer training. He has been a Magento Certified Developer since 2011 and a Magento Certified Solution Specialist since 2014.

This Post has 13 Comments

  1. Peter says:

    Can you tell us something about the hardware environment?
    I need some rough calculation what to expect on a specific hardware.

    Btw: nice job, good to read.

  2. Peter says:

    Wow, then these numbers are great.
    Now I have to see how can I do a fast import with attributes …

    thankyou

  3. Petar says:

    Nice article, I would like to do the tests too and was wondering if the import data you used is included with the test framework?
    If its not, can you share it?
    I am working with uRapidFlow module and have not seen so low speeds, and wanted to check if there is a specific case I am missing.

  4. Hi Andreas,

    The uRapidFlow numbers don’t make any sense. Average numbers we see are 100+ products per second. Would you mind sharing your test environment?

    Best regards,
    Boris.

    • Import data is included in the test suite. It’s a bit tricky to set up due to the different modules included – might be I configured something wrong. If someone finds an error in the testsuite, I’ll happily post a correction.

      • There are 2 configurations for reindexing. one general in Import Options, and for specific indexers in Reindex tab. Perhaps you just didn’t add the one in the Reindex tab, without changing default value in Import Options.

  5. Although the results for uRapidFlow remained low on our test server and neither we nor Boris (Unirgy) could find out why, I have updated the results to use numbers provided by Unirgy, adjusted to the general speed of our test server to keep comparability. With these numbers, uRapidFlow is the fastest import method in the test.
