XML mapping performance

I'm importing some largish XML files (about 2 MB each) and have noticed the following:

- Using PostgreSQL as the database, the mapping of one XML file takes about 35 (!) minutes.
- Mapping that exact same file while running the project on the built-in database takes just 1.5 minutes.

It is to be expected that the built-in database performs better, but not by this much. This, coupled with the observation that the mapping on PostgreSQL slows down the longer it runs, seems to indicate some sort of buffering issue (something to do with transactions?).

So I also tested mapping the same file split into 1152 separate files (each element that is the starting point of the mapping in a separate XML file). The results of this mapping are:

- Built-in: total mapping time increased by a factor of 2 (to 3 minutes). Somewhat slower because of the extra overhead of running 1152 mappings instead of just 1, but still a very acceptable speed.
- PostgreSQL: total mapping time decreased by a factor of 10 (to 3.3 minutes). Just a bit slower than when using the built-in database, which is to be expected, but (despite the extra overhead of running multiple mappings) much faster than the first scenario.

So my question: why the dismal performance on PostgreSQL when mapping large XML documents? Is there any way to improve the performance, or is the XML mapping not intended to be used for these amounts of data?
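For anyone who wants to reproduce the split-file test, splitting one document into one file per mapping start element could look roughly like the sketch below. It is a minimal Python illustration, not the tool used for the measurements above. The function name `split_xml`, the `part_NNNN.xml` file naming, and the assumption that the start elements are direct, non-namespaced children of the root are all hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_xml(source: str, out_dir: str, tag: str) -> int:
    """Write every direct child of the root named `tag` to its own file.

    Assumption: the mapping start elements are direct children of the
    root element and `tag` is their (non-namespaced) name.
    Returns the number of files written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    root = ET.parse(source).getroot()
    count = 0
    for count, element in enumerate(root.findall(tag), start=1):
        # Wrap each element in a fresh copy of the root tag so every
        # output file keeps the same outer shape as the original.
        wrapper = ET.Element(root.tag, root.attrib)
        wrapper.append(element)
        target = out / f"part_{count:04d}.xml"  # hypothetical naming scheme
        ET.ElementTree(wrapper).write(
            str(target), encoding="utf-8", xml_declaration=True
        )
    return count
```

Calling `split_xml("big.xml", "parts", "order")` would write `parts/part_0001.xml`, `parts/part_0002.xml`, and so on, each of which can then be imported as a separate mapping run.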
1 answer

Hi Alexander,

We have also noticed performance differences on larger files, albeit less extreme than in your scenario. For the upcoming 2.5 release we have made a major update to the XML mapper, resulting in constant performance on big XML files (both CPU and memory).
