Handling an increasingly large amount of data via scheduled events
Hi all,

We have some processing in our app that works as follows: there are multiple entities, each with a reasonable number of attributes, and we collect the info from these entities and prepare it in a single table (let's call it Table T) with a huge number of attributes, so it can be exposed to another app via OData.

This processing is done via scheduled events running nightly: we delete the old Table T from the DB, then start a flow that collects all the latest and updated data from the smaller entities and puts it into the new Table T. This is quite time- and resource-consuming, with DB CPU utilization frequently going above 95%.

The problem is that the data we collect will keep growing, and the info we need to put into Table T will get bigger over time. What would be the best ways to optimize the data collection process, and what other ideas do you have for optimizing this whole concept?
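For clarity, the nightly flow described above can be sketched roughly like this. This is only an illustration under invented assumptions (the platform isn't stated, so this uses Python with an in-memory SQLite database; the `customers`, `orders`, and `table_t` names are made up), but it shows why the cost grows with data volume: every run deletes Table T and re-collects everything from scratch.

```python
import sqlite3

# Invented stand-in schema: two small source entities and the wide Table T.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE TABLE table_t (order_id INTEGER PRIMARY KEY, customer_name TEXT, total REAL);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Bolt")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 99.0), (11, 2, 5.0)])

def nightly_rebuild(conn):
    # Delete the old Table T entirely, then re-collect all source data.
    # Simple and always correct, but the work done scales with the FULL
    # data set every night, not with what actually changed.
    conn.execute("DELETE FROM table_t")
    conn.execute("""
        INSERT INTO table_t (order_id, customer_name, total)
        SELECT o.id, c.name, o.total
        FROM orders o JOIN customers c ON c.id = o.customer_id
    """)
    conn.commit()

nightly_rebuild(conn)
print(conn.execute("SELECT COUNT(*) FROM table_t").fetchone()[0])  # 2
```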
Would it be an idea to merely update existing records instead of deleting and rebuilding?
i.e. if current != database → update attributes; if current == database → skip.
With ever-increasing amounts of data it will take a while regardless, but try to filter out any unnecessary steps. The above suggestion is a bit oversimplified, but you might as well iterate over the existing table/entity, update only the records that actually need updating, and explicitly commit those in a separate list you create, instead of deleting and rebuilding everything. On top of that, you could potentially run the SE a few times sequentially to break up the load, if you have a way to filter out the records already updated by the SE that night.
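A minimal sketch of that incremental approach, under the same invented assumptions as before (Python with SQLite standing in for the unstated platform; the `table_t` schema and `batch_size` are made up): compare each incoming row with what is already stored, skip the unchanged ones, and commit the changed ones in separate batches rather than one giant transaction. Note this sketch does not handle rows deleted upstream, which a delete-and-rebuild handles for free.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO table_t VALUES (?, ?)",
                 [(1, "old"), (2, "same"), (3, "stale")])

def incremental_sync(conn, latest, batch_size=2):
    """Upsert only rows whose payload differs; commit in batches.

    `latest` maps row id -> freshly collected payload. Returns the
    number of rows actually written.
    """
    existing = dict(conn.execute("SELECT id, payload FROM table_t"))
    pending = []
    for row_id, payload in latest.items():
        if existing.get(row_id) == payload:
            continue  # current == database -> skip
        pending.append((row_id, payload))  # current != database -> update
    written = 0
    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        # Upsert: insert new rows, overwrite changed ones (SQLite >= 3.24).
        conn.executemany(
            "INSERT INTO table_t (id, payload) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET payload = excluded.payload",
            batch)
        conn.commit()  # commit each batch separately to bound transaction size
        written += len(batch)
    return written

# id 1 changed, id 2 unchanged, id 4 is new; id 3 is untouched (no delete handling).
changed = incremental_sync(conn, {1: "new", 2: "same", 4: "added"})
print(changed)  # 2
```

The batched commits play the role of "committing those in a separate list": only the changed records are written, and each commit covers a bounded chunk of work, which keeps the DB load per transaction small even as the total data grows.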
Of course, if none of this applies or it doesn't suit your needs, then my apologies.