CX Works

A single portal for curated, field-tested and SAP-verified expertise for your SAP Customer Experience solutions. Whether it's a new implementation, adding new features, or getting additional value from an existing deployment, get it here, at CX Works.

Catalog Synchronization

Catalog synchronization is the feature of publishing content from one catalog version to another. Although this is a standard capability of SAP Commerce Cloud, it is also one of the underestimated areas of a solution design that can cause complications and significant risks for project delays, if not addressed early in the project.

When multiple catalogs or even multiple business users work on content in parallel, a recommended practice is to design and prototype the catalog synchronization processes early on in the project. Failure to do this early can mean significant changes to your catalog structure might be necessary. These changes can impact back end processes (imports, exports, publications), business tools and most importantly, the storefront itself. Some of the possibilities for designing a robust and better performing catalog synchronization are covered below.

How is the Catalog Synchronization Initiated?

What is the trigger for a catalog synchronization? It can be triggered manually by a business user (for example, by clicking the "synchronize" button in the backoffice product cockpit) or automatically (for example, by a scheduled cron job). Generally, synchronizing the whole catalog should be done through a scheduled job. The main differences between the two approaches are:

Scheduled Cron Job

  • You can easily schedule cron jobs to run during a period of low traffic and therefore low system utilization.
  • The catalog synchronization will eventually remove catalog-aware items in the active catalog version from the entity cache and evict queries from the query cache. It is better for front end performance if this happens during a period of low system utilization.
  • It supports multiple parallel product cockpit users.
  • Issues are generally easier to reproduce and resolve.
  • System administration training is required. Administrative users will need to monitor the scheduled synchronization and should be able to do basic troubleshooting.

Manual

  • System resources need to handle both the backoffice tools and the catalog synchronization. Running both together can create performance problems that otherwise wouldn't be an issue.
  • Your storefront aspects will need to repopulate the cache with the catalog-aware items in the active catalog version. Depending on a number of factors, such as the number of changed items and the lack of higher level caching, performance can be negatively impacted. This should be avoided when possible.
  • Only one catalog synchronization job can run on a node at one time. If two business users run a catalog synchronization at the same time, the second job will be aborted without a meaningful response to explain what happened.
  • To resolve issues, you may need to focus on the way the business tools are being used by a particular user.
  • End user training is required. End users will be responsible for the synchronization process. They will need to understand how to start the synchronization process and how to verify that it completed successfully. From a business point of view, this increases the complexity of the business process.

There may be a requirement to support manual catalog synchronization. If the business is aware of the potential issues and the workflows are adjusted accordingly, the impact can be minimized. Supporting both manual and scheduled catalog synchronization at the same time should be avoided, as it can lead to further complications and potential conflicts.
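The "only one synchronization at a time" constraint can be illustrated with a small guard. This is a minimal sketch in plain Java, not SAP Commerce Cloud code: the class SyncGuard and its method are hypothetical and only show how a second, overlapping request could be rejected with an explicit message instead of being aborted silently.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard, not part of SAP Commerce Cloud. It models the rule
// that only one synchronization may run at a time and reports rejections.
class SyncGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Runs the job if no synchronization is active; otherwise explains why it was skipped. */
    String runExclusively(String requestedBy, Runnable syncJob) {
        // compareAndSet succeeds for exactly one caller while the flag is false.
        if (!running.compareAndSet(false, true)) {
            return "Rejected: a synchronization is already running; the request by "
                    + requestedBy + " was not started.";
        }
        try {
            syncJob.run();
            return "Completed: synchronization requested by " + requestedBy + ".";
        } finally {
            running.set(false); // always release the flag, even if the job fails
        }
    }
}
```

A workflow that must support manual triggers could surface the rejection message to the business user, so that an overlapping request is at least explained.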

Which Items Will be Synchronized?

Synchronization can cover either the entire catalog version (the easiest option) or a selected group of items.

There are two typical use cases for filtering the items to be synchronized:

  • There is a requirement to synchronize only approved products. The recommended practice is to avoid using the out of the box "approvalStatus" attribute, as you may want to synchronize products which are not approved. Instead, introduce a new boolean attribute "readyForSync". A business user or business logic would be responsible for setting the attribute to TRUE. A restriction would guarantee that only confirmed products are synchronized.
  • Synchronize a subset of items (such as products of a specific category) to a specified target catalog version.

One important consideration, in an enterprise setting where there are many concurrent product managers, is the coordination of the synchronization. The "readyForSync" flag can be helpful to schedule products for the next synchronization run. However, it offers limited granularity of control in an environment where there are multiple teams managing their own lines of products. To provide a more granular coordination, it is recommended to use a hybrid approach where the "readyForSync" flag and the category-specific synchronization are used together. This way, different teams managing different brands or categories would not affect each other in unintended ways and it offers more flexibility in terms of when the synchronization can be run.
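The hybrid approach can be pictured as a filter that combines both conditions. The following is a minimal sketch over an in-memory list, not platform code: the Product class and the select method are hypothetical, and only the "readyForSync" name comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model. Only the "readyForSync" flag name follows
// the text; the class and method names are illustrative assumptions.
class SyncCandidateFilter {
    static final class Product {
        final String code;
        final String category;
        final boolean readyForSync;

        Product(String code, String category, boolean readyForSync) {
            this.code = code;
            this.category = category;
            this.readyForSync = readyForSync;
        }
    }

    /** Returns the codes of staged products in the given category that are flagged as ready. */
    static List<String> select(List<Product> staged, String category) {
        List<String> codes = new ArrayList<>();
        for (Product product : staged) {
            if (product.readyForSync && product.category.equals(category)) {
                codes.add(product.code);
            }
        }
        return codes;
    }
}
```

With this shape, a team responsible for one category can schedule its own run without picking up flagged products from other categories.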


If you synchronize using a scheduled cron job then a search restriction can be used with a dedicated synchronization user:


INSERT_UPDATE SearchRestriction;code[unique=true];name[lang=en];query;principal(UID);restrictedType(code);active;generate
;Backend_Sync_Product;Sync;{item.readyForSync}=1;syncgroup;Product;true;true


Remember to allow restrictions in the session:


import org.apache.log4j.Logger;

import de.hybris.platform.catalog.jalo.SyncItemCronJob;
import de.hybris.platform.jalo.SessionContext;
import de.hybris.platform.jalo.flexiblesearch.FlexibleSearch;

// Custom synchronization job that keeps search restrictions active during synchronization.
public class MyCatalogVersionSyncJob extends GeneratedMyCatalogVersionSyncJob
{
    @SuppressWarnings("unused")
    private static final Logger LOG = Logger.getLogger(MyCatalogVersionSyncJob.class.getName());

    @SuppressWarnings("deprecation")
    @Override
    protected SessionContext createSyncSessionContext(final SyncItemCronJob cronJob)
    {
        final SessionContext ctx = super.createSyncSessionContext(cronJob);
        ctx.setAttribute(FlexibleSearch.DISABLE_RESTRICTIONS, Boolean.FALSE); // ENABLE User restrictions
        ctx.setAttribute(FlexibleSearch.DISABLE_RESTRICTION_GROUP_INHERITANCE, Boolean.FALSE); // ENABLE GROUP restrictions
        return ctx;
    }
}


If you synchronize manually, then it can get more complicated:

  • Some jobs may abort without an informative error message if multiple business users trigger a synchronization at the same time. 
  • References to items in the same catalog version will not be synchronized unless marked as "copy by value" or "part-of".
  • Restrictions, used for scheduled cron job synchronization, do not work when synchronizing manually. Everything will be synchronized.
  • "Closely" related types may exist. Business users tend to think of "closely" related types as one type even though they are technically separate types. For example, a product consists of prices, variant products, media and so on. So what happens if those "closely" related types get modified?
    • The Product Cockpit will need an updated configuration to show that the product is out of sync when a "closely" related type (such as media) gets modified. The media may work out of the box but your custom types will not. You will need to adjust the relatedReferencesTypesMap property of the synchronizationService bean. The following example shows the modification for the Product Cockpit in the accelerator code base:


      	<alias alias="synchronizationService" name="accSynchronizationService" />
      	<bean id="accSynchronizationService" class="de.hybris.platform.cockpit.services.sync.impl.SynchronizationServiceImpl" parent="defaultSynchronizationService">
      		<property name="relatedReferencesTypesMap">
      			<map merge="true">
      				<entry key="Product">
      					<list>
      						<value>Product.productImages</value>
      					</list>
      				</entry>
      				<entry key="MediaContainer">
      					<list>
      						<value>MediaContainer.medias</value>
      					</list>
      				</entry>
      			</map>
      		</property>
      	</bean>
    • If there is more than one level of "closely" related types (for example, a product variant under a product variant under a product), neither the display nor the synchronization will cascade down the structure in all cases. The first success (no need to synchronize) will stop the algorithm.

In either case, make sure that all the references will be copied to the target catalog version. For example, if you synchronize a product and there is a reference to a category, that category needs to be in the target catalog version to make the link for the reference.

Catalog Merging

Just as it is possible to have a source catalog version synchronizing to multiple target catalog versions, it is also possible to have multiple source catalog versions synchronizing to a single target catalog version. In this case, unless they are updating different attributes and/or languages, the content will need to be merged from the different source catalog versions.

The simplest strategy, which doesn't require any customization, is to allow the last synchronization job that runs to overwrite the changes of the previous job.

This will generally work as expected if the following is true:

  • The item and all its attributes belong to only one source catalog version (such as products belong to source catalog A and categories belong to source catalog B).
  • Each catalog synchronization job synchronizes different attributes.
  • Each catalog synchronization job is configured for a different language or locale.

However, sometimes you will need to merge values from the same attributes and item types. A typical example would be for the relation product -> category or category -> category:

  • The category – product relation is populated in the target catalog version (on a product item, in this example). This means that there are references from previous synchronizations, manually entered references, references added by impex or API and so on.
  • Once the synchronization runs, it takes the values of the product.supercategories attribute from the source version and replaces all the values in the target version with them, without exception.

Different business rules could apply for merging the values already in the target version with the content from the source versions. Typically, the values would be appended rather than overwritten, so that the values from both source versions are kept. The recommended approach is to extend and customize the Jalo based catalog synchronization job (CatalogVersionSyncJob). This business logic would need to be added to CatalogVersionSyncCopyContext.translate(..).
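The difference between the two merge rules can be shown with plain collections. This is a minimal sketch, assuming that category codes are simple strings. It does not use or reproduce CatalogVersionSyncCopyContext, and the class SupercategoryMerge is hypothetical.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: plain string sets stand in for the supercategories of a product.
class SupercategoryMerge {
    /** Overwrite rule: the source values replace whatever the target held. */
    static Set<String> overwrite(Collection<String> target, Collection<String> source) {
        return new LinkedHashSet<>(source);
    }

    /** Append rule: keep the existing target values and add the source values once. */
    static Set<String> append(Collection<String> target, Collection<String> source) {
        Set<String> merged = new LinkedHashSet<>(target);
        merged.addAll(source); // the set drops values present on both sides
        return merged;
    }
}
```

A customized synchronization job would apply the equivalent of the append rule only to the attributes where merging is a business requirement, and leave the overwrite behavior in place elsewhere.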

An alternative option is to disable the synchronization on the type attribute where the synchronization occurs by using synchronization properties. For example:

  • The category – product relation is populated in the target catalog version (on a product item, in this example).
  • The synchronization does not take values from the product.supercategories attribute from the source version because it is not configured for synchronization.
  • The synchronization still takes the category.products attribute values and copies them into the target category item. From the category point of view, however, the values are replaced and from the product point of view, they are merged.

This alternative solution does have some side effects. For example, all the changes need to happen on the synchronized side: in our example, changes to the product – category relation must be entered on the category item. As these side effects are not intuitive and could potentially create confusion, it is safer to customize the synchronization job directly, as discussed above.

Prices 

Sometimes, prices are managed outside of SAP Commerce Cloud. The cart and order calculation, however, still occur in SAP Commerce Cloud. In this situation, it doesn't make sense to import the prices into the staged catalog version and synchronize them to the online version since the prices are managed in a separate environment. Also, this additional price synchronization:

  • Takes unnecessary space in the database
  • Considerably slows down the synchronization process
  • Marks synchronized items as unsynchronized in the case of a manual synchronization (for example, changing PriceRow marks its product as modified for synchronization purposes)
  • Forces business users to make changes to the product content quickly in order to be able to publish new prices quickly

The recommended practice is to store PriceRows independently with a loose coupling to the product (see Decoupling PDTRows from Product).

Media

Similar to prices, media are often managed outside of SAP Commerce Cloud, especially for media conversion. If a project requirement is such that media conversion is done by an external system and attributes such as URL are also managed by another system, it is much better to create a separate media catalog with just one catalog version. This way, when a product catalog is synchronized, media items are not cloned from staged version to online version. In a large product catalog, since typically there are multiple media per product, the number of media items can become huge. In such situations, having catalog-aware product media items can present significant performance challenges.

Performance

There are ways to improve catalog synchronization performance:

  • Set the number of parallel threads to two times the number of cores in your system (and monitor performance).
  • Avoid ordered relations.
  • Avoid synchronization of media, PriceRow, DiscountRow and TaxRow if possible (see above).
  • Remove types and attributes that are not used by the project from the synchronization.
  • Make sure you have the right indexes in place on unique attributes (see Logging Database Statements to confirm).
    • CatalogVersionSyncJob.checkCatalogVersionValidity() performs duplicate checks on every root type to ensure the correct state of the catalog version before starting the actual synchronization. On a large data set, this is an expensive query. If unique database indexes are in place, these duplicate checking SQL queries are redundant and create unnecessary load on the database, especially in a large catalog. In such a case, the default CatalogVersionSyncJob implementation can be overridden to disable the duplicate checking behavior for certain root types within a particular catalog.

Part-of vs Root Type

An important part of configuring synchronization is defining the attributes that will be synchronized. Simple types are easy. Their values will be copied across to the target item. Item references that are catalog version-unaware are also easy as the primary key (PK) value of the referenced item will be copied across.

In some cases, there will be attributes that are catalog-aware item references. This means synchronization cannot simply copy the PK value because there needs to be a corresponding target Item.

Synchronization logic needs to consider these attributes so that target Items are created and correct target items are referenced. There are two ways to achieve this.

Part-of Configuration

Part-of configuration ensures that any catalog-aware attribute of a synchronized type (that is, one with matching items in the staged and online catalog versions) is handled properly even if the target items do not exist in the target catalog version. It works for partial synchronization as well as full synchronization. Part-of configuration is done by editing the SyncAttributeDescriptorConfig item for the attribute. "Copy by value" is a boolean value that determines the part-of configuration. If it is set to yes, the attribute is "part-of" the enclosing type. The "Untranslatable value" setting next to it also affects the resulting behavior. The following summarizes the behavior for the different combinations of the two values.


  • Untranslatable value = yes, Copy by value = yes: The attribute value in the target item will be cleared. For a reference, it will be null; for a collection, it will be empty.
  • Untranslatable value = yes, Copy by value = no: The attribute value in the target item will point to the same PK as in the source item. That is, the PK won't be translated to the corresponding item in the target catalog version. For example, online products will hold the reference to media items in the staged catalog version.
  • Untranslatable value = no, Copy by value = yes: Normal part-of behavior. In other words, missing items are created in the target catalog version.
  • Untranslatable value = no, Copy by value = no: Synchronization will attempt to translate the PK to the corresponding item in the target catalog version if the type of the attribute is catalog-aware. In the case where the matching item does not exist, it will fail (that is, the attribute will not be synchronized).
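The four combinations can be read as a small decision function. The sketch below is a plain Java model, not platform code: ReferenceSyncRules and its Outcome values are hypothetical names, the attribute is assumed to be a catalog-aware reference, and the handling of an already existing target item (translated in both "Untranslatable value = no" cases) is an assumption rather than something stated above.

```java
// Hypothetical model of the behavior summary above for one catalog-aware
// reference attribute. Names and the TRANSLATED case are assumptions.
class ReferenceSyncRules {
    enum Outcome { CLEARED, SAME_PK_AS_SOURCE, TARGET_CREATED, TRANSLATED, NOT_SYNCHRONIZED }

    static Outcome resolve(boolean copyByValue, boolean untranslatable, boolean targetExists) {
        if (untranslatable) {
            // No translation is attempted: clear the value or keep the source PK.
            return copyByValue ? Outcome.CLEARED : Outcome.SAME_PK_AS_SOURCE;
        }
        if (targetExists) {
            // Assumption: an existing counterpart is simply referenced.
            return Outcome.TRANSLATED;
        }
        // Missing counterpart: part-of creates it, otherwise the attribute fails.
        return copyByValue ? Outcome.TARGET_CREATED : Outcome.NOT_SYNCHRONIZED;
    }
}
```

Writing the rules down this way makes it easier to review the configuration of each custom reference attribute against the intended behavior.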

Root Type Declaration

An alternative way is to define the type as one of the root types. Root types are applicable in a full catalog version synchronization context only. Synchronization will loop through each root type in the order they are defined and will make sure that any item in the source catalog version is created in the target catalog version. You should define a root type if the type is more or less an independent, stand-alone type that can exist without being referenced from another type. Refer to Synchronization of Custom Item Types for detailed instructions on how to create a new root type.

Troubleshooting Failure

A message like the one shown below is often found in the log when synchronization jobs fail. Often, this error means that one or more catalog version-aware reference attributes could not be translated (in other words, the target item could not be found, its type was not declared as a root type and the attribute was not configured as part-of).


[CatalogVersionSyncJob] Finished synchronization in 0d 00h:00m:00s:442ms. There were errors during the synchronization!


The synchronization csv dump file can be useful to find the problematic attribute. The columns in the csv file are ordered as per the table below. The table shows that there was an error translating the "products" list attribute from the source item to the target item. Assuming that products are of the collection type, if any one of the items in the collection could not be translated, the attribute will be marked as pending and its value would not be synchronized.


Source Item PK    Target Item PK    itemsynctimestamp PK    Pending attribute qualifier
8796093055118     8796093087886     8796093055595           product
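When the dump file is large, it can help to read it programmatically. The following is a minimal sketch that splits one line into the four columns listed above. The delimiter of the dump file is not stated here, so it is passed in as a parameter, and the class SyncDumpLine is a hypothetical helper rather than a platform class.

```java
import java.util.regex.Pattern;

// Hypothetical helper for reading one line of the synchronization dump.
// The column order follows the table above; the delimiter is an assumption
// supplied by the caller.
class SyncDumpLine {
    final String sourcePk;
    final String targetPk;
    final String timestampPk;
    final String pendingAttribute;

    private SyncDumpLine(String sourcePk, String targetPk, String timestampPk, String pendingAttribute) {
        this.sourcePk = sourcePk;
        this.targetPk = targetPk;
        this.timestampPk = timestampPk;
        this.pendingAttribute = pendingAttribute;
    }

    static SyncDumpLine parse(String line, String delimiter) {
        // Pattern.quote treats the delimiter literally; -1 keeps empty trailing columns.
        String[] parts = line.split(Pattern.quote(delimiter), -1);
        if (parts.length < 4) {
            throw new IllegalArgumentException("Expected 4 columns but found " + parts.length);
        }
        return new SyncDumpLine(parts[0].trim(), parts[1].trim(), parts[2].trim(), parts[3].trim());
    }
}
```

Grouping the parsed lines by pendingAttribute gives a quick view of which attributes fail most often.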


If this isn't enough information you can:

  1. Set the configuration property cronjob.logtofile.threshold to DEBUG.
  2. Set the logging level to DEBUG in the HAC for the list of classes mentioned below: (If the classes do not appear in the HAC, run a synchronization from that node and this will load the classes and list them.)
    1. Config Property log4j.logger.de.hybris.platform.catalog.jalo.synchronization.CatalogVersionSyncWorker to DEBUG.
    2. Config Property log4j.logger.de.hybris.platform.catalog.jalo.synchronization.CatalogVersionSyncMaster to DEBUG.
    3. Config Property log4j.logger.de.hybris.platform.catalog.jalo.synchronization.AbstractItemCopyContext to DEBUG.
    4. Config Property log4j.logger.de.hybris.platform.catalog.jalo.synchronization.CatalogVersionSyncCopyContext to DEBUG.
  3. Using the catalog synchronization wizard, set the File Log level to DEBUG in the final window.
  4. If you look in the logs for your background processing nodes, you should now have more details on the issues that occurred during the synchronization, such as:


    [de.hybris.platform.servicelayer.interceptor.impl.MandatoryAttributesValidator@26263fa9]:missing values for [code] in model ApparelProductModel (8832234909114) to create a new ApparelProduct"
  5. You would then need to identify all items with the same issue. For example, if you see errors about "missing values", you can run a query like the following (Note: replace "ApparelProduct" with the type that is throwing an error): SELECT {pk} FROM {ApparelProduct} WHERE {code} IS NULL

  6. You could then create a groovy or ImpEx script like the following to update or remove the items in question:


    REMOVE ApparelProduct;pk[unique=true]
    ;8951755421348
    ;8951755624411

    If you expect the same issue in your other environments, ensure you keep your scripts and add them to your deployment script.

  7. Re-run the synchronization. New errors will likely come up, so keep iterating through these steps until you've resolved all the issues.
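If the query in step 5 returns many PKs, the removal script from step 6 can be generated rather than typed. This is a minimal sketch in plain Java that reproduces the ImpEx layout shown above. The class RemoveScriptBuilder is hypothetical, and the generated script should be reviewed before it is imported.

```java
import java.util.List;

// Hypothetical helper that renders a REMOVE ImpEx script in the same
// layout as the example in step 6.
class RemoveScriptBuilder {
    static String build(String typeCode, List<Long> pks) {
        StringBuilder script = new StringBuilder();
        // Header line: the PK is the unique key used to find each item.
        script.append("REMOVE ").append(typeCode).append(";pk[unique=true]\n");
        for (Long pk : pks) {
            // Value lines start with an empty first column.
            script.append(';').append(pk).append('\n');
        }
        return script.toString();
    }
}
```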

Configuration and Customization to Improve Performance

Catalog synchronization performance is an issue that affects most SAP Commerce Cloud projects. This is a list of configurations and customizations that have proven to improve the performance of the synchronization. Often we're asked what's normal in terms of catalog synchronization performance (for example, items/s). It's not easy to come up with a reasonable figure because it's highly dependent on the type that is being synchronized (for example, products are much slower than PriceRows), and it's highly dependent on the data model. A data model with a lot of dependent types using relations will never perform as well for synchronization as a simple data model would. The number of supported languages and localized attributes will also impact performance. If you are averaging somewhere around 1-5 items/s per thread, then you should be looking for improvements.

Threads

By increasing the number of threads in the "catalog.sync.workers" property, you can increase the performance of the CatalogVersionSyncJob. Generally, you will get a linear improvement in performance until the point where the number of threads overloads either the application server or the database server. A good place to start is to ensure all your synchronization jobs are running on your background processing aspect and to allocate two times as many threads as there are cores on the application server (for example, 8 cores = 16 threads). This is done automatically in later versions of SAP Commerce Cloud. If you notice this is overloading your server, you can adjust the property to something smaller.
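The starting value can be derived from the core count of the node. The following is a minimal sketch of that arithmetic. The class SyncWorkerSizing is hypothetical, and only the property name "catalog.sync.workers" and the factor of two come from the text above.

```java
// Hypothetical helper: computes the suggested starting value for the
// "catalog.sync.workers" property as two threads per core.
class SyncWorkerSizing {
    static int recommendedWorkers(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be positive");
        }
        return cores * 2;
    }

    /** Renders the value as a property line for the local configuration. */
    static String asProperty(int cores) {
        return "catalog.sync.workers=" + recommendedWorkers(cores);
    }

    public static void main(String[] args) {
        // Uses the cores visible to the JVM on the current node.
        System.out.println(asProperty(Runtime.getRuntime().availableProcessors()));
    }
}
```

Treat the result as a first value to measure against, and lower it if the application server or the database shows signs of overload.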

Transactions

Enable transactions via the "catalog.sync.enable.transactions" property. Not only does this allow the product create/update to be treated as a batch of SQL statements by the database and thereby improving performance, but it also reduces the number of database statements sent to the database. Without transactions enabled in the Jalo layer, every time you set an attribute of an item it immediately executes an update statement to the database. Obviously, you really want all attributes of a type to be set with one update statement and the transaction allows you to do that.
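The effect on the number of statements can be shown with a toy model. This is a sketch only: AttributeWriteModel is a hypothetical class that records SQL-like strings, and it does not reflect how the Jalo layer is actually implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model, not platform code: it counts the statements produced when
// attributes are written with and without an open transaction.
class AttributeWriteModel {
    final List<String> statements = new ArrayList<>();
    private Map<String, Object> pending; // non-null while a transaction is open

    void begin() {
        pending = new LinkedHashMap<>();
    }

    void setAttribute(String name, Object value) {
        if (pending != null) {
            pending.put(name, value); // collected, written on commit
        } else {
            statements.add("UPDATE products SET " + name + " = ?"); // written immediately
        }
    }

    void commit() {
        if (pending != null && !pending.isEmpty()) {
            // All collected attributes go into a single statement.
            statements.add("UPDATE products SET " + String.join(" = ?, ", pending.keySet()) + " = ?");
        }
        pending = null;
    }
}
```

In this model, setting three attributes outside a transaction records three statements, while the same three calls between begin() and commit() record one.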


There is the potential for deadlocking when using transactions, therefore you will need to test the synchronization extensively to ensure stability doesn't suffer at the expense of performance. Since version 5, catalog synchronization uses the service layer by default for persistence and the same improvement can be achieved by setting "model.service.transactional.saves=true" property.


This change is global; it does not apply only to catalog synchronization.


Ordered Relations

When defining the many side of a relation, you have the option to say whether it is ordered (default) or unordered. Ordered relations perform very poorly. If you have a number of these in the data model, then performance will be severely impacted. You will need to check whether every relation really does need to be ordered and, if not, explicitly configure it to be unordered.

Here is an example of an unordered relation, note the ordered="false" on supplierProducts:


<relation code="Supplier2SupplierProductRel" generate="true" localized="false" autocreate="true">
	<sourceElement qualifier="supplier" type="Supplier" cardinality="one" />
	<targetElement qualifier="supplierProducts" type="SupplierProduct" cardinality="many" ordered="false"/>
</relation>


Category to Product Relation

This is the same issue as above but, as this relation is defined in a core SAP Commerce extension, you don't have the option to change to unordered in items.xml. In many projects, there is no business need to allow the products to be ordered within a category. Since the products are typically exported to a search engine (for example, Solr), you would normally lose this ordering anyway, as it will default to the Solr search profile you have configured.

If you don't need an ordered category to product relation, then you can override the standard CategoryManager. Alternatively, from version 5.0 onwards, you can change the ordering of existing relations by setting the following property:


relation.CategoryProductRelation.source.ordered=false


Relation Marking

Changes to one side of a relation can cause all the items to be marked as modified. For example, calling the CategoryModel.setProducts(products) can trigger an update statement for every product in the collection to update its timestamp. A performance improvement can be made by disabling the marking. However, be careful when using this feature because there could be some processes (such as catalog synchronization) that use the feature in order to know which items to include.


relation.CategoryProductRelation.markmodified=false


PriceRow Product Marking

All of the attributes of the PriceRow (PDTRow) have been overridden to update the modified time stamp of the product belonging to this PriceRow when the price row changes. If you're updating three attributes of the PriceRow, that results in three update statements to the product to set the modified time. If you're doing a full synchronization of the catalog this is unnecessary, because the PriceRow is a root type in the synchronization process. It can be synchronized independently of the product.

If marking the product is really required, you should consider using a service layer interceptor or other means instead so that you only update the product once and not for every attribute that has been changed.

To disable the price row marking in the catalog synchronization and other processes, you need to set a flag on the SessionContext. This is an example for a custom CatalogVersionSyncJob:


@Override
protected SessionContext createSyncSessionContext(SyncItemCronJob cronJob) {
	SessionContext ctx = super.createSyncSessionContext(cronJob);
	ctx.setAttribute(Europe1Constants.PDTROW_MARK_PRODUCT_MODIFIED, Boolean.FALSE);
	return ctx;
} 


Ordering of Root Types

The ordering of the synchronization's root types is very important because it prevents or at least reduces the need to do multiple passes to resolve the values that need to be set for an attribute. The fewer passes required, the faster the synchronization. This is particularly important if you have your own custom catalog item types; you should ensure that they are synchronized in the correct order in respect of the standard types to reduce the number of passes.
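One way to reason about the order is to list, for each root type, the types it references and then sort so that referenced types come first. The sketch below is a generic topological sort in plain Java. The class RootTypeOrdering and the type names in the usage example are hypothetical, and the platform does not compute the order this way for you.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Generic topological sort (Kahn's algorithm) over hypothetical root types.
class RootTypeOrdering {
    /** @param references maps each type to the types it references and that should be synchronized first */
    static List<String> order(Map<String, List<String>> references) {
        Map<String, Integer> unresolved = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : references.entrySet()) {
            unresolved.putIfAbsent(entry.getKey(), 0);
            for (String referenced : entry.getValue()) {
                unresolved.putIfAbsent(referenced, 0);
                unresolved.merge(entry.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(referenced, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : unresolved.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey()); // references nothing, can go first
            }
        }
        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String type = ready.poll();
            ordered.add(type);
            for (String dependent : dependents.getOrDefault(type, Collections.<String>emptyList())) {
                if (unresolved.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (ordered.size() != unresolved.size()) {
            throw new IllegalStateException("Cyclic references: these types need more than one pass");
        }
        return ordered;
    }
}
```

For example, if a hypothetical custom type Accessory references Product, and Product references Category and Media, the result places Category and Media before Product and Product before Accessory. Types that reference each other in a cycle cannot be fully ordered and will still need additional passes.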

Remove Attributes

If you don't need some attributes in the online catalog version, then you should remove them from the synchronization job altogether. In particular, you want to remove all relation attributes that are not required because they are most likely to impact performance. Here is a list of attributes you might consider removing.

  • Item.comments
  • Item.assignedCockpitItemTemplates
  • Item.dimVals (if using Hyend 1)
  • Category.linkComponents
  • Category.productCarouselComponents
  • Category.restrictions
  • Category.productListComponents
  • Product.linkComponents
  • Product.restrictions
  • Product.productCarouselComponents
  • Product.productListComponents
  • Product.productDetailComponents
  • Product.promotions

SAP Commerce Cloud is usually efficient at checking whether an attribute has really changed before doing an update. This doesn't work properly for collection types, however, which are serialized to a column in the database. Therefore, these attributes are often updated even if they have not changed or are not even being used.

If these columns are not being used, then also remove them from the configuration:

  • Product.buyerIDS
  • Product.specialTreatmentClasses
  • Product.articleStatus (Note: This is different from approvalStatus, which is almost always used)

See this help page which explains how to configure which attributes are included through the backoffice. For an example of how to configure this in the project itself, see the productCatalogEditSyncDescriptors bean definition in commerceservices-spring.xml. Changing the bean definition in your project will require importing project data and it will only apply the configuration when the synchronization job is first created (for example, during initialization).

High Cardinality Between Categories and Products

If you have a high number of products assigned to categories, then you will most likely have some performance problems. This happens because the default behavior will try to load all the product PKs for a category in memory (which also causes significant thread blocking). If it's acceptable for the link between categories and products to only be created by the product itself, then you should remove the synchronization attribute configuration for Category.products.

Set Sync Languages

This is important if you're using separate catalogs for each country or if some catalogs have a subset of the overall available languages. In this case, you should set the languages of the catalog version to ensure that you only copy the localized values for the supported languages.
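The effect of restricting the languages can be pictured as a filter over the localized values of an attribute. This is a minimal sketch with plain maps. The class LocalizedValueFilter is hypothetical and language codes are assumed to be simple ISO strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: only values for the configured languages
// are carried over to the target.
class LocalizedValueFilter {
    static Map<String, String> restrictTo(Map<String, String> localizedValues, Set<String> syncLanguages) {
        Map<String, String> copied = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : localizedValues.entrySet()) {
            if (syncLanguages.contains(entry.getKey())) {
                copied.put(entry.getKey(), entry.getValue());
            }
        }
        return copied;
    }
}
```

Every language left out removes localized values from each synchronized item, which adds up quickly across a large catalog.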

Reduction of Root Types

The other question to ask is, "do you need to synchronize everything in the first place?" For example, on many projects prices are coming from an external system (such as the ERP) and the price should be visible on the website as quickly as possible. If this is the case, then you could write the PriceRow directly to the online catalog version without synchronizing it. Obviously, you should consider validating the PriceRow first to make sure the PriceRow is valid.

You should also consider if there is a need to synchronize product images and other media. There can be a huge amount of media in the catalog and this media can have a very big impact on synchronization performance. In many cases, you can just avoid synchronizing media. For example, if you have a policy whereby if you need to change the image of a product, you need to create a new image with a new code rather than updating the existing image, then you could avoid using synchronization. The staged and online version of the product can point to different images so you still have an approval and publication process in place for updated product images.

Single PriceRow

Often the prices are managed on an external system, so there is no advantage of having a staged and online version in SAP Commerce Cloud. The PriceRow can be decoupled from a specific version of a product and referenced by a product code. 

Initial Attributes

The catalog synchronization process will execute at least two database statements for every creation of an item. One is the create statement with all the initial attributes; the second is the update statement with all non-initial attributes. If you have localized attributes defined on a type, then it will be four statements in total, since you also have a create statement and an update statement for the localized table of the type.

Out of the box, you have no control over which attributes the catalog synchronization treats as initial. Furthermore, to flag an attribute as initial you need to address it in two places: in the catalog synchronization and in the Jalo layer. Setting an attribute as initial in the synchronization (isRequiredForCreation(..)) only ensures that the attribute map passed to the Jalo item for creation includes this attribute. In the Jalo layer, you still need to set the attribute mode to initial, as you normally would when improving creation performance.

This is an example of how you can override the isRequiredForCreation(..) method so that an attribute is included in the ItemAttributeMap passed to the createItem method of the Jalo item and is therefore eligible to be an initial attribute.


// In your subclass of CatalogVersionSyncJob: hand out the custom copy context.
@Override
protected CatalogVersionSyncCopyContext createCopyContext(final CatalogVersionSyncCronJob cj, final CatalogVersionSyncWorker worker) {
	return new CustomCatalogVersionSyncCopyContext(this, cj, worker);
}

// The custom copy context decides which attributes are passed to item creation.
public class CustomCatalogVersionSyncCopyContext extends CatalogVersionSyncCopyContext {
	public CustomCatalogVersionSyncCopyContext(final CatalogVersionSyncJob job, final CatalogVersionSyncCronJob cronjob, final CatalogVersionSyncWorker worker) {
		super(job, cronjob, worker);
	}

	@Override
	protected boolean isRequiredForCreation(final AttributeDescriptor ad) {
		....
	}
}


Extreme care should be exercised over which core attributes are added for initial creation. Neither part-of attributes nor attributes whose Jalo methods have been overridden should be made initial attributes.
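The decision itself can be kept very small. The following is a minimal, standalone sketch of one possible rule, not the platform implementation: AttributeInfo is a hypothetical stand-in for the facts you would read from the AttributeDescriptor, the jaloMethodOverridden flag is something you would have to maintain yourself, and the qualifiers are made up for illustration.

```java
import java.util.Set;

// Standalone illustration of an allow-list rule for initial attributes.
// All types and names here are hypothetical; in a real project the decision
// would live in the overridden isRequiredForCreation(..) method.
final class InitialAttributeRule {

    /** Hypothetical stand-in for what the rule needs to know about an attribute. */
    static final class AttributeInfo {
        final String qualifier;
        final boolean partOf;
        final boolean jaloMethodOverridden;

        AttributeInfo(String qualifier, boolean partOf, boolean jaloMethodOverridden) {
            this.qualifier = qualifier;
            this.partOf = partOf;
            this.jaloMethodOverridden = jaloMethodOverridden;
        }
    }

    private final Set<String> candidateQualifiers;

    InitialAttributeRule(Set<String> candidateQualifiers) {
        this.candidateQualifiers = Set.copyOf(candidateQualifiers);
    }

    boolean isRequiredForCreation(AttributeInfo attribute) {
        // Exclusions from the text: never part-of attributes, never attributes
        // whose Jalo methods have been overridden.
        if (attribute.partOf || attribute.jaloMethodOverridden) {
            return false;
        }
        // Only explicitly reviewed attributes are allowed.
        return candidateQualifiers.contains(attribute.qualifier);
    }
}
```

An explicit allow-list keeps the set of initial attributes small and reviewable, which fits the advice to add core attributes with extreme care.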

Database Performance

The catalog synchronization process fires a lot of inserts and updates at the database, so make sure you monitor the load and the performance of the database. If you follow the steps above, you will reduce the number of queries that are sent to the database. However, without a database that can process those queries very quickly, your chance of improving the performance to a satisfactory level is significantly reduced.

There are two areas to consider. The first is the load on the database (for example, CPU, memory and disk I/O). If any of these is higher than you would like, then either you don't have enough hardware for the database server, or perhaps you've just increased the number of threads for the synchronization job. In the latter case, try reducing the threads and see if you get the performance you want.

The other area to look at is the network I/O performance between the SAP Commerce Cloud background processing nodes and the database. A slow or saturated network link will cause a lot of problems for synchronization performance because so many queries need to be sent to the server.

You can test this by using the database tools in the administration console. You can also turn on JDBC logging on the background processing aspects and look at how long the queries are taking to process. For example, 0-2 milliseconds is generally quite good. Queries that take longer than 10 milliseconds generally point to an issue on the database server or the network link.
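Once you have query durations from the JDBC log, a quick classification against these thresholds shows whether there is a problem worth chasing. The sketch below assumes the durations have already been extracted as milliseconds (parsing the log is left out), and the "borderline" label for anything between the two thresholds is an assumption of this example rather than a rule from the platform.

```java
import java.util.List;

// Classifies query durations using the rules of thumb from the text:
// 0-2 ms is generally good, more than 10 ms points to a database or network issue.
final class QueryLatencyCheck {

    static final long GOOD_UP_TO_MS = 2;
    static final long SLOW_ABOVE_MS = 10;

    static String classify(long durationMs) {
        if (durationMs <= GOOD_UP_TO_MS) {
            return "good";
        }
        if (durationMs > SLOW_ABOVE_MS) {
            return "investigate";
        }
        return "borderline"; // label chosen for this sketch only
    }

    static long countSlow(List<Long> durationsMs) {
        return durationsMs.stream().filter(d -> d > SLOW_ABOVE_MS).count();
    }

    public static void main(String[] args) {
        // Hypothetical durations taken from a JDBC log, in milliseconds.
        List<Long> durationsMs = List.of(1L, 2L, 3L, 12L, 40L);
        for (long duration : durationsMs) {
            System.out.println(duration + " ms: " + classify(duration));
        }
        System.out.println("Queries to investigate: " + countSlow(durationsMs));
    }
}
```

If most statements land in the "investigate" bucket, look at the database load and the network link first; if only a few specific statements do, a missing index or stale statistics is the more likely cause.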

Addition of Indexes

The chances of finding that an index is missing in the catalog synchronization are quite small, but you should turn on the JDBC logging nonetheless. The most likely place to find a missing index is a type that you have defined with its own unique catalog key, where not all attributes of the key are included in an index.
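As a simple illustration of that check, the sketch below tests whether at least one index contains every attribute of a unique catalog key. The attribute and index names are hypothetical, and the check deliberately ignores column order, which also matters to the database in practice.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Checks whether any index includes every attribute of a unique catalog key.
// Simplification for illustration: column order within the index is ignored.
final class UniqueKeyIndexCheck {

    static boolean isCoveredByAnIndex(Set<String> keyAttributes,
                                      Collection<List<String>> indexes) {
        for (List<String> indexColumns : indexes) {
            if (indexColumns.containsAll(keyAttributes)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical unique catalog key of a custom type.
        Set<String> keyAttributes = Set.of("code", "region");

        // Hypothetical indexes defined on the type.
        List<List<String>> indexes = List.of(
                List.of("catalogVersion", "code"));

        System.out.println("Key covered by an index: "
                + isCoveredByAnIndex(keyAttributes, indexes));
    }
}
```

In this made-up case the region attribute is not part of any index, so lookups by the full key cannot be served by a single index.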

Removal of Indexes

Database indexes can also have a negative impact on the performance of inserts and updates. You can have a database administrator analyze if all the database indexes are really required and remove the ones that are not used to improve the database performance.

Database Maintenance

If you notice that the query performance of the database is not what it should be, then it could be that the table statistics are out of date or the indexes are fragmented. Most likely, after loading a large amount of data, the table statistics are no longer valid for the database and you should rebuild them.

As an example, after importing ~500,000 products, one database query was observed taking 80% of the database query time (~280 ms for each execution). After updating the table statistics, the same query executed in 1 ms or less. This one change improved the synchronization of products from 30 items/s to 290 items/s.

If you're seeing similar issues, follow the steps to open a support ticket so that application support can determine whether this is the root cause.
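To put those rates into perspective, here is the arithmetic as a small sketch. It assumes, purely for illustration, that all 500,000 products are synchronized in one run at a constant rate; the class and method names are hypothetical.

```java
// Converts a synchronization rate into an approximate run time.
// Assumption for illustration: a constant rate over the whole run.
final class SyncDurationEstimate {

    static double secondsFor(long items, double itemsPerSecond) {
        if (itemsPerSecond <= 0) {
            throw new IllegalArgumentException("itemsPerSecond must be positive");
        }
        return items / itemsPerSecond;
    }

    public static void main(String[] args) {
        long items = 500_000;
        double before = secondsFor(items, 30);   // rate before updating statistics
        double after = secondsFor(items, 290);   // rate after updating statistics
        System.out.printf("Before: %.0f s (%.1f h)%n", before, before / 3600);
        System.out.printf("After:  %.0f s (%.1f min)%n", after, after / 60);
    }
}
```

Under these assumptions the run drops from roughly 4.6 hours to just under 29 minutes, which is the difference between a synchronization that needs an overnight window and one that fits into a short maintenance slot.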

Conclusion

Although catalog synchronization may seem simple at a high level, many facets affect its design and how quickly items can be synchronized. By understanding how synchronization works, how it is best executed, where the potential performance bottlenecks are and how to identify and troubleshoot issues, you should be able to achieve quicker synchronizations.