Are there best practices for large-scale delete operations?

What's the trick to doing large-scale, mass deletions? For various reasons, I regularly need to delete ALL instances of two entities overnight -- one has 70,000 records, and the other has 800,000. I have learned that Mendix does NOT like pulling these all into a list to delete them. My instance is regularly running out of memory.

At this point, I have a microflow that deletes them in batches. It does 1,000 at a time, and I have a workflow instance that orchestrates this -- it calls the microflow over and over again until a counter on the workflow context reaches zero. This still results in warnings about database memory when the job runs.

One of the entities is linked to the other, so I have to delete one of them first, but I wouldn't have any problem just doing this in raw SQL, if the Mendix Cloud would allow that (I can do this locally -- it's gloriously simple). There are no event handlers bound to them.

(Honestly, I basically need to wipe out the entire database -- I delete EVERY persistable entity, once per day.)

What about a Java Action? I've considered doing a batched ID lookup in OQL, then deleting the entities from Java. I feel like this would be lower memory pressure, since it's not actually populating the entity. But it's a LOT of single deletes to the database. (If only OQL allowed deletes...)

Is there some form of mass deletion available? Is there a better way to do this?
asked
7 answers

Hey, yeah, this is a common pain point. Batch deleting is the right approach, but your batch size and method matter a lot.


A Java action is definitely the way to go here. You can use Core.createXPathQuery to retrieve batches of objects and then Core.delete on each batch, all within the same Java action in a loop. The key is to keep batch sizes reasonable (something like 1,000-5,000) and commit/clear the context between batches so the runtime doesn't hold all those objects in memory. Something like a while loop that retrieves a batch, deletes it, and repeats until the batch comes back empty. This avoids the overhead of passing objects back to the microflow layer on each iteration.
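The loop shape looks like this. To keep the sketch self-contained, an in-memory `store` with hypothetical `retrieveBatch`/`deleteBatch` helpers stands in for the real Mendix calls, which only exist inside the runtime:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchDeleteSketch {
    // Plain in-memory stub standing in for the Mendix store. In a real Java
    // action you would retrieve with Core.createXPathQuery(...) limited to
    // the batch size, and delete with Core.delete(context, batch).
    static final List<Integer> store = new ArrayList<>();

    static List<Integer> retrieveBatch(int limit) {
        return new ArrayList<>(store.subList(0, Math.min(limit, store.size())));
    }

    static void deleteBatch(List<Integer> batch) {
        store.removeAll(batch); // stands in for Core.delete(context, batch)
    }

    static String runDemo() {
        for (int i = 0; i < 10_500; i++) store.add(i); // simulate 10,500 records

        final int batchSize = 1_000;
        int batches = 0;
        while (true) {
            List<Integer> batch = retrieveBatch(batchSize);
            if (batch.isEmpty()) break; // nothing left to delete: done
            deleteBatch(batch);
            batches++;
            // In a real action, end/restart the transaction here so the
            // runtime releases the deleted objects before the next batch.
        }
        return batches + " batches, " + store.size() + " remaining";
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // 10,500 records in 1,000-object batches
    }
}
```

The important part is the shape: retrieve a limited batch, delete it, repeat until the retrieve comes back empty, and never let a list survive across iterations.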


If you really want to go nuclear and there are no before/after delete handlers and no delete rules you care about, you can execute raw SQL from a Java action using JDBC. You can get a database connection from the runtime and just run a TRUNCATE or DELETE statement directly. Something like `Core.dataStorage().executeWithConnection(conn -> { conn.prepareStatement("TRUNCATE TABLE mymodule$myentity CASCADE").executeUpdate(); return null; })`. This obviously bypasses all Mendix logic, so you need to be careful, but for wiping entire tables overnight with no event handlers it's ridiculously fast compared to doing it through the ORM. I ran into this exact scenario on a project where we had nightly data refreshes, and going through the normal Mendix delete path for hundreds of thousands of records was just not practical.


One thing to watch out for with the JDBC approach: make sure you handle the association tables too if you have many-to-many relationships, and be aware the table names in the database are lowercased versions of the module and entity names with a dollar sign separator.
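The naming convention is simple enough to capture in a helper. This is a sketch only; `tableName` is a hypothetical function, and where possible you should derive names from the runtime rather than hardcoding them:

```java
public class MendixTableName {
    // Hypothetical helper illustrating the convention: an entity
    // "MyModule.MyEntity" lives in a table named "mymodule$myentity"
    // (lowercased, module and entity joined by a dollar sign).
    // Association tables follow the same module$associationname pattern.
    static String tableName(String module, String entity) {
        return (module + "$" + entity).toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(tableName("MyModule", "MyEntity"));
    }
}
```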

answered

Does the Mendix Cloud allow raw SQL? I had that running fine locally, but the Cloud threw an error when I tried it.

answered

This did work, thank you. My problem is that the tables are named differently on my local machine and production.


Local: "PUBLIC"."module$table" (with the quotes)

Cloud: module$table


So, what works on one, doesn't work on the other (which is why I got an error previously). I'll need to add some logic to vary the table names based on a constant, set per environment.


But, yes -- the crux of the code you provided works, and has really helped. Thanks again.

answered

I found this...

String tableName = Core.getDatabaseTableName(Core.getMetaObject(this.EntityType));

But it still returns the wrong table name.
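For now, a small normalization step bridges the two forms. This is only a sketch, with a hypothetical `normalize` helper, assuming the only local difference is the quoted `"PUBLIC".` schema prefix:

```java
public class TableNameNormalizer {
    // Hypothetical cleanup for a table name that may arrive in either form:
    //   local:  "PUBLIC"."module$table"   (quoted, schema-prefixed)
    //   cloud:  module$table              (bare)
    // Strips any schema prefix and double quotes, leaving the bare
    // lowercase "module$table" form.
    static String normalize(String raw) {
        String name = raw;
        int dot = name.lastIndexOf('.');
        if (dot >= 0) name = name.substring(dot + 1); // drop "PUBLIC". prefix
        return name.replace("\"", "").toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(normalize("\"PUBLIC\".\"module$table\""));
        System.out.println(normalize("module$table"));
    }
}
```

Whether the bare form is accepted on both databases is worth verifying in each environment; if not, the per-environment constant approach mentioned above still works.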

answered

You do not need a Workflow for this. A Scheduled Event is the correct approach for a nightly cleanup job.


The key issue is not batching itself — you are already doing that. The real problem is that even though you retrieve in batches, the delete operations are still executed within the same long-running transaction. Because of that, Mendix keeps the transaction open and memory usage grows, which results in database memory warnings.


Here is the recommended solution:

Create a batch structure inside a Scheduled Event.

Retrieve the objects in chunks (for example 3,000 at a time).

After each retrieve, immediately perform the delete.


Most importantly:

After each delete operation, call CommunityCommons → EndTransaction.


Without EndTransaction, Mendix still accumulates the deletions in the same transaction scope. By explicitly ending the transaction after every batch, the platform commits the changes to the database in smaller chunks (e.g. 3,000 records at a time) and releases memory before continuing.


This way:

  • Each batch is committed separately
  • Memory pressure is reduced
  • Database warnings disappear
  • No Java action or raw SQL is required



This is the cleanest and most stable solution for large-scale deletions in Mendix Cloud.


If this resolves the issue, please close the topic.


answered

Firstly, you cannot connect to the Mendix Cloud database directly with your own SQL client: the database URL, username, and password are not exposed, so external direct-to-database deletion is not possible. Mendix Cloud restricts direct database access for security and platform-integrity reasons; any raw SQL has to go through the runtime's own connection, as discussed in the earlier answers.

Secondly, OQL (Object Query Language) is designed only for retrieving data from the database. You cannot use OQL to delete, update, or insert records. At runtime, OQL materializes the requested data as objects in memory and returns it, but it does not support data-manipulation operations like DELETE or UPDATE.

The best option for large-scale deletion is to use a Scheduled Event. You can configure the scheduled event to run at a time when application usage is minimal (for example, overnight).

Inside the scheduled event, implement a batch deletion process according to Mendix best practices. A batch size of 1,000 objects is typical; since you are only deleting objects (not running heavy logic per object), you can increase it to 2,000 if performance allows.

For each manual iteration of the batch, use the Start Transaction and End Transaction Java actions available in the Community Commons module. By controlling transactions manually, you can reduce memory pressure on the server. In Mendix, memory usage increases during large transactions. By committing and closing transactions per batch, you ensure that memory is released incrementally instead of being held until the entire process completes.

Using Scheduled Events, combined with controlled batch processing and transaction management, is the safest and most scalable approach for mass deletion in Mendix Cloud.

answered

Hi,


Large-scale delete operations must be handled carefully in Mendix because the platform is not designed to retrieve and delete hundreds of thousands of objects in a single transaction. Doing so can lead to memory issues, timeouts, or crashes.

Why Your Approach Causes Problems

When you:

  • Retrieve all 70k / 800k objects at once
  • Delete them in one big list

Mendix tries to load all objects into memory before deleting them. This can exhaust the JVM heap and slow down dramatically.

Even batching inside a loop can still cause memory pressure if you hold lists across iterations.

Recommended Best Practices

1. Delete in Small Batches Using a Loop (Best Mendix Practice)

Do not retrieve the entire table at once.

Instead use:

Microflow pattern:


Repeat until no more records found:
   Retrieve next 1000 records (limit 1000)
   Delete list
   Commit

This limits memory footprint and completes the transaction in smaller chunks.

Example logic:

  • Retrieve list of next batch (e.g., 500–2000)
  • Delete that list
  • Commit
  • Loop until empty

This avoids loading huge lists.

2. Use Scheduled Events / Queues

Instead of a single long-running microflow, split across multiple jobs:

  • Scheduled event picks next batch
  • ProcessQueue module can execute jobs sequentially
  • Each batch is a separate transaction

This reduces memory leaks and allows concurrency control.

3. Use Database-Level Truncate/Delete via JDBC (Fastest)

If you truly need to wipe out entire tables and you are not dependent on Mendix delete behavior (events, associations, business logic), the fastest approach is:

  • Use executeWithConnection from the Mendix DataStorage API
  • Run SQL DELETE or TRUNCATE TABLE

Community examples show raw SQL running much faster than batched deletes in a microflow.

Important: this bypasses delete behavior (no Mendix event handlers), so only use when safe.

4. Avoid Pulling Entire List into Memory

Mendix recommends not retrieving more than a few thousand objects at a time, because doing so can exhaust memory.

If you must retrieve data:

  • Use Range in retrieve (limit)
  • Paginate with offset/limit

This ensures smaller footprint.


5. Additional Tips

  • Delete behavior cascades: deleting a parent can trigger many sub-deletes, which slows down bulk operations. Consider deleting child objects manually first.
  • If objects have associations, delete bottom-up instead of relying on cascade.
  • A soft-delete flag (e.g., IsDeleted) with filtering can be a good temporary alternative before the actual delete.


answered