Master data management & concurrency

Hello, I'd like to hear your opinions/suggestions/solutions/thoughts about the following.

In nearly all Mendix applications you'll encounter some master data: non-transactional data entities referenced by the transactional data of the application. Examples are entities like countries, articles, employees, etc. For objects like these it is important that no duplicates are ever created (multiple "The Netherlands" country objects, for example, should not be possible).

When the application is accessed by a limited number of users through the Mendix client, the chance that two users try to create the same entity at exactly the same time is very small. In those cases, setting the microflow property 'Disallow concurrent execution' to true and showing an error message to the user is sufficient: the user can simply try again when this occurs. However, when the application has a lot of concurrent users, or is integrated with automated processes that deliver data in real time, the chance of concurrent creation becomes large enough to require a better solution. Especially with automated processes, you cannot just return a concurrent execution error and make the other party responsible for the error handling.

A good example case: a Mendix application handles orders for products and keeps track of the stock levels of those products. Orders and stock levels can be entered manually in the Mendix web client, or can arrive from other systems via web services. Both order and stock level messages reference products: if a product does not exist yet, it has to be created. This is done using the "find; if not found, create" setting in the XML mapping that handles the web service messages. Since web services are inherently executed concurrently (and in this case there are even multiple web services that use the same entity), there is a very real possibility that the "find; if not found, create" part of one of the XML mappings is executed concurrently, resulting in duplicate objects.

So my question is: how would you handle this in a Mendix project?
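To make the race concrete: this is the classic check-then-act problem, and one straightforward (single-node) way to close it is to route every create of the master data entity through a custom Java action that serializes "find; if not found, create" per natural key. The sketch below is only an illustration of that idea: findByKey and createForKey are hypothetical callbacks standing in for the project's own retrieve and create logic, and an in-memory lock like this only helps when all create paths go through it and the application runs on a single node.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Minimal sketch: serialize "find; if not found, create" per natural key
 * (e.g. product number) so two concurrent requests for the same key cannot
 * both create the object. findByKey and createForKey are hypothetical
 * callbacks standing in for the project's own retrieve/create logic.
 * Note: an in-memory lock only works on a single application node.
 */
public final class FindOrCreate<T> {

    // One lock object per key, shared by all threads in this JVM.
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    private final Function<String, T> findByKey;    // returns null when nothing exists
    private final Function<String, T> createForKey; // creates and commits a new object

    public FindOrCreate(Function<String, T> findByKey, Function<String, T> createForKey) {
        this.findByKey = findByKey;
        this.createForKey = createForKey;
    }

    public T findOrCreate(String key) {
        T existing = findByKey.apply(key);
        if (existing != null) {
            return existing; // fast path, no locking needed
        }
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Re-check inside the lock: another thread may have created it meanwhile.
            T found = findByKey.apply(key);
            return found != null ? found : createForKey.apply(key);
        }
    }
}
```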
asked
1 answer

Hi Alexander,

If the products that are used are subject to some sort of standard, you could make sure that their ID is only stored once in your Mendix database. For example, all car parts have their own unique identifier, which could support the prevention of duplicate records.
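As a generic illustration of what "stored only once" looks like under concurrency (outside Mendix itself, since Mendix manages its own database schema): back the standard identifier with a unique constraint and treat a concurrent insert of the same identifier as a recoverable conflict. The JDBC sketch below uses a hypothetical product table; the point is the catch-and-requery around the unique violation, not the specific API.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * Generic illustration (not Mendix-specific): with a unique constraint on the
 * standard identifier, a concurrent insert of the same product turns into a
 * constraint violation that can be caught, after which the existing row is
 * simply looked up again. Table and column names are hypothetical, and the
 * sketch assumes auto-commit (inside an open transaction the failed INSERT
 * would have to be retried in a fresh transaction on e.g. PostgreSQL).
 */
public final class ProductLookup {

    public long findOrInsert(Connection con, String externalId) throws SQLException {
        Long existing = findId(con, externalId);
        if (existing != null) {
            return existing;
        }
        try (PreparedStatement insert =
                 con.prepareStatement("INSERT INTO product (external_id) VALUES (?)")) {
            insert.setString(1, externalId);
            insert.executeUpdate();
        } catch (SQLException e) {
            // SQLState 23505 = unique violation (PostgreSQL): another transaction was first.
            if (!"23505".equals(e.getSQLState())) {
                throw e;
            }
        }
        Long id = findId(con, externalId);
        if (id == null) {
            throw new SQLException("Product missing after insert: " + externalId);
        }
        return id;
    }

    private Long findId(Connection con, String externalId) throws SQLException {
        try (PreparedStatement select =
                 con.prepareStatement("SELECT id FROM product WHERE external_id = ?")) {
            select.setString(1, externalId);
            try (ResultSet rs = select.executeQuery()) {
                return rs.next() ? rs.getLong("id") : null;
            }
        }
    }
}
```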

If the products are not subject to any standard, then such a problem is hard to tackle. I am currently working on a CRM application that will be used almost 24 hours per day by a large group of users, so we will need some form of solution for this as well.

We will use two things for this:

  • Data stewards; their main goal is to make sure that all data is of the right quality. We will support them by creating 'merge' functionality. This functionality will allow users to merge the records of e.g. 'Jan Jansen' and 'Jan Janssen' if they turn out to be the same person.
  • CUCU reports; reports to ensure that all data is consistent, unique, correct and up-to-date. These reports should give the customer insight into all records with certain data missing (e.g. all persons without a last name, or without a primary e-mail address). We've also found Java code online that 'detects' duplicate records within certain boundaries (e.g. Jan Janssen will be detected as a possible duplicate of Jan Jansen, but John Jensen will not; you can tweak how strict these checks are). A minimal sketch of that kind of check is included at the end of this answer.

So in general this is not something we can avoid completely, but something to take into account when developing an application.
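To give an idea of what such a check boils down to, here is a generic sketch based on the Levenshtein edit distance (this is not the code we found online, and the 0.8 similarity threshold is just an example value to tune per project):

```java
/**
 * Illustration of a fuzzy duplicate check: "Jan Janssen" vs "Jan Jansen"
 * is flagged as a possible duplicate, "John Jensen" is not. Generic sketch;
 * the 0.8 threshold is an example value to adjust per project.
 */
public final class DuplicateCheck {

    /** Classic Levenshtein edit distance (number of single-character edits). */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0,1]; 1.0 means identical strings (case-insensitive). */
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(a.toLowerCase(), b.toLowerCase()) / maxLen;
    }

    public static boolean possibleDuplicate(String a, String b, double threshold) {
        return similarity(a, b) >= threshold;
    }

    public static void main(String[] args) {
        System.out.println(possibleDuplicate("Jan Janssen", "Jan Jansen", 0.8)); // true
        System.out.println(possibleDuplicate("Jan Jansen", "John Jensen", 0.8)); // false
    }
}
```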

answered