Question

XPath: multiple over-association OR-constraints

1

Hello, In a Mendix webinar from 2014 , optimization of not()- and or-constraints is being discussed in XPath retrieves. It is mentioned that it is advisable to create separate DB-retrieves for each over-association OR-constraint, and use a Union-action to combine the lists. I found this very interesting and created a small experiment. I created a very simple domain model: As for data: I created 25.000 Main-objects, and 500 Associated1 and Associated2 objects (each of which is associated with 50 Main-objects). Next, I created 2 microflows that measure the duration of the different retrieves. The first microflow had 1 retrieve with the following XPath constraint: The second microflow had the same constraints, but each of them was put in a separate retrieve action. A Union-List action was added as a third action in this microflow. Both of these flows were run 2.500 times, and the average duration for each option was calculated. The end result was that the combined retrieve took an average of 13ms, and the separate retrieves combined with the union action took an average of 3ms. This is a very large difference, which really surprises me. My question is whether I perhaps forgot other influences which might cause these differences in the way I performed my test, and second, if anyone else has any knowledge about splitting retrieves this way, or has experience using this method. Thanks in advance, and sorry for the long read. Greetings Martin

asked 2021-04-21

Martin Leppen

2 answers

Nico Mouton · Answer 1 · 2021-04-21

Hi Martin,

Check out this learning path: https://academy.mendix.com/link/path/62/Constrain-Your-Data-Using-XPath. I’m sure you will find it insightful and it is sure to answer your questions.

This method works, in my experience. Logically, the more rows in your table, the greater gains you will see.

Joost Winkens · Answer 2 · 2021-04-21

Hi Martin,

If you have some knowledge regarding queries and their performance (I don’t have a lot of knowledge on this topic), then it might help you to look at the actual queries.

I rebuilt your testcase in my project, looking at the queries of your two scenarios it shows that they are really different.

First scenario (one query):
'SELECT Test.Main/* FROM Test.Main WHERE Test.Main.ID IN (SELECT x1Test.Main.ID FROM Test.Main AS x1Test.Main WHERE (x1Test.Main.ID IN (SELECT x2Test.Main.ID FROM Test.Main AS x2Test.Main WHERE x2Test.Main.ID = x1Test.Main.ID AND x2Test.Main/Test.Main_Associated1 = 27021597764222977) OR x1Test.Main.ID IN (SELECT x3Test.Main.ID FROM Test.Main AS x3Test.Main WHERE x3Test.Main.ID = x1Test.Main.ID AND x3Test.Main/Test.Main_Associated2 = 27303072740933633)))'

Second scenario (two queries):
'SELECT Test.Main/* FROM Test.Main WHERE Test.Main.ID IN (SELECT x1Test.Main.ID FROM Test.Main AS x1Test.Main WHERE x1Test.Main/Test.Main_Associated1 = 27021597764222977)'

'SELECT Test.Main/* FROM Test.Main WHERE Test.Main.ID IN (SELECT x1Test.Main.ID FROM Test.Main AS x1Test.Main WHERE x1Test.Main/Test.Main_Associated2 = 27303072740933633)'

I’m hoping that this is of any help.

Kind regards,

Joost