Import a CSV file directly into the database, is this possible?

Weekly I get a file with part numbers and prices from certain suppliers. I want to import those into our system so we always use the most recent prices. Is there an easy way to delete all records from the database for this supplier and to import the file? There are no web services available; it's just the old-fashioned way. Thanks, Edward
3 answers

At this moment we don't have an easy way to import CSV files into the database. Unfortunately, to do that you have to execute some custom Java code.
First, a small suggestion about your request: if you remove all the products from your database before importing them again, you will also lose all their associations. If, for instance, an order has an association to a product, the old products will be removed and new products imported, leaving the order pointing at the old, deleted product. It is therefore usually a better idea to update all existing products and never remove any objects.
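The update-instead-of-delete idea boils down to an upsert keyed on the part number. Here is a plain-Java sketch of that pattern (the `Product` class and field names here are made up for illustration; they are not the Mendix proxies used in the code further down):

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertSketch {
    // Minimal stand-in for a product record; not the Mendix Product proxy.
    static class Product {
        String partNumber;
        double price;
        Product(String partNumber, double price) { this.partNumber = partNumber; this.price = price; }
    }

    // Update the price when the part number already exists, otherwise create a new record.
    // Existing Product instances (and anything associated with them) stay alive.
    static void upsert(Map<String, Product> catalog, String partNumber, double price) {
        Product existing = catalog.get(partNumber);
        if (existing != null) {
            existing.price = price;      // update in place, associations survive
        } else {
            catalog.put(partNumber, new Product(partNumber, price));
        }
    }

    public static void main(String[] args) {
        Map<String, Product> catalog = new HashMap<>();
        upsert(catalog, "A-100", 9.95);
        Product first = catalog.get("A-100");
        upsert(catalog, "A-100", 8.50);  // weekly price update
        System.out.println(catalog.get("A-100") == first);   // same object, new price
    }
}
```

The same decision (update when the key exists, create when it doesn't) is what the batch-based importer below makes per CSV line.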

Below I've added an example of some code I have written in the past. It can be called from an action you have created in the modeler.
CALL THE IMPORT ACTION

    public Boolean executeAction() throws Exception
    {
        // BEGIN USER CODE
        //Only add code between the lines //BEGIN USER CODE and //END USER CODE; everything outside these lines will be removed when you redeploy from the Modeler.

        //Create the ProductImporter class. This will initialize all variables.
        ProductImporter importer = new ProductImporter(this.getContext(), this.document.getMendixObject());
        //Now start the import
        importer.startImport();

        return true;
        // END USER CODE
    }

THIS IS AN EXAMPLE OF HOW THE IMPORT ACTION COULD WORK:

package importer;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map.Entry;

import com.mendix.core.Core;
import com.mendix.core.CoreException;
import com.mendix.systemwideinterfaces.core.IBatch;
import com.mendix.systemwideinterfaces.core.IChangeBatch;
import com.mendix.systemwideinterfaces.core.IContext;
import com.mendix.systemwideinterfaces.core.ICreateBatch;
import com.mendix.systemwideinterfaces.core.IMendixIdentifier;
import com.mendix.systemwideinterfaces.core.IMendixObject;

import moduleName.proxies.Product;
import moduleName.proxies.Customer;

public class ProductImporter {

    private IMendixObject dataFile;
    private String objectType = Product.getType();
    private HashMap<Integer, String> colMapping = new HashMap<Integer, String>();
    private IContext context;
    private static final String _separator = ";";

    private HashMap<String, IMendixObject> curObjects;
    private HashMap<String, IMendixIdentifier> customerObjects;
    private int keyColNr;
    private int nrOfColumns;
    private int refColNr;

    public ProductImporter( IContext context, IMendixObject dataFile ) throws CoreException {
        //Store the datafile in a variable, the datafile is a MxObject that inherits from FileDocument and must contain the csv
        this.dataFile = dataFile;
        this.context = context;
        //Prepare the maps where existing objects can be stored
        this.curObjects = new HashMap<String, IMendixObject>();
        this.customerObjects = new HashMap<String, IMendixIdentifier>();

        //Create a mapping from all csv columns to the Mendix attributes.
        this.keyColNr = 0;
        this.nrOfColumns = 0;
        this.colMapping.put(this.nrOfColumns++, Product.MemberNames.CustomerProductID.toString());
        this.colMapping.put(this.nrOfColumns++, Product.MemberNames.RetailerProductID.toString());
        this.colMapping.put(this.nrOfColumns++, Product.MemberNames.Description.toString());
        this.colMapping.put(this.nrOfColumns++, Product.MemberNames.EANCode.toString());
        this.colMapping.put(this.nrOfColumns++, Product.MemberNames.Price.toString());
        this.refColNr = this.nrOfColumns++;


        /*
         * Retrieve all existing objects and place them in a map with the key column(sapnr) as key of the map and the object as the value
         * This way the objects can be retrieved later on.
         */
        List<IMendixObject> curObjects = Core.retrieveXPathQuery(this.context, "//" + this.objectType);
        for( IMendixObject object : curObjects ) {
            IMendixObject customer = Core.retrieveId(this.context, (IMendixIdentifier) object.getValue(Product.MemberNames.Product_Customer.toString()));
            this.curObjects.put(customer.getValue(Customer.MemberNames.SAPNumber.toString()) + (String)object.getValue( this.colMapping.get(this.keyColNr) ), object);
        }

        //Get all customers and put their id in a map so an association can be set to that customer.
        List<IMendixObject> customerObjects = Core.retrieveXPathQuery(this.context, "//" + Customer.getType());
        for( IMendixObject object : customerObjects ) {
            if( (String)object.getValue( Customer.MemberNames.SAPNumber.toString()) != null )
                this.customerObjects.put((String)object.getValue( Customer.MemberNames.SAPNumber.toString()), object.getId());
        }
    }


    public void startImport() throws CoreException {
        //Try to retrieve the content from the uploaded file
        InputStream stream = Core.getFileDocumentContent(this.dataFile);
        if( stream == null ) {
            throw new CoreException( "There is no file uploaded.");
        }

        //Put the file stream in a buffered reader so the file can be read line by line
        BufferedReader reader = new BufferedReader(new InputStreamReader(stream));  
        String line = null; 
        try { 

            //Create two batches so all objects can be sent to the database in batches.  
            ICreateBatch createBatch = Core.createBatch(this.context, this.objectType, 200, false, false);  // this function expects( the current context,  the type of objects to store in this batch,  nr of objects to commit in one batch,  execute validation rules, auto commit after X objects)
            IChangeBatch changeBatch = Core.changeBatch(this.context, new ArrayList<IMendixObject>(), 200, false, false);  // this function expects( the current context,  a pre defined list containing all objects that must be committed as well,  nr of objects to commit in one batch,  execute validation rules, auto commit after X objects)

            //Prepare two variables to raise the auto nr of my product object and to keep track of the current line we are reading.
            int lineNr = 0;
            Long autoNr = null;
            while ((line = reader.readLine()) != null) {
                try {
                    lineNr++;
                    //Split the current line by the separator. 
                    String[] lineParts = line.split(_separator);

                    //Is the total nr of parts found in this line the same as the expected amount?
                    if( lineParts.length >= nrOfColumns ) {
                        String customerId = processValue(lineParts[this.refColNr]);
                        IBatch curBatch;

                        //If the key of this object already exists, the object has to be changed. So add that object to the change batch
                        if( this.curObjects.containsKey( customerId + processValue(lineParts[this.keyColNr]))  ) {
                            curBatch = changeBatch;
                            changeBatch.next(this.curObjects.get(customerId + processValue(lineParts[this.keyColNr])) );
                        }

                        //The object doesn't exist so prepare a new record in the createbatch
                        else {
                            curBatch = createBatch;
                            createBatch.create();
                            //If the autonr is not set before get the current autonr value
                            if( autoNr == null ) 
                                autoNr = (Long) createBatch.getMember(Product.MemberNames.MendixProductID.toString()).getValue();


                            //Raise the auto nr and store the new value in the new object
                            autoNr++;
                            createBatch.setMemberValue(Product.MemberNames.MendixProductID.toString(), autoNr);
                        }


                        //For each pre defined column mapping, add the value to the attribute
                        for( Entry<Integer, String> mapping : this.colMapping.entrySet() ) {
                            curBatch.setMemberValue( mapping.getValue(), processValue( lineParts[mapping.getKey()] ) );
                        }

                        /*
                         * To add new mapping values use this function.
                         * curBatch.setMemberValue( "AttributeName", processValue( lineParts[5 *Or any other column number*] ) );
                         */


                        //If the customer key from the csv exists in the database set the association. If it doesn't exist throw an exception
                        if( this.customerObjects.containsKey( customerId ) ) {
                            curBatch.setMemberValue(Product.MemberNames.Product_Customer.toString(), this.customerObjects.get(customerId));
                        }
                        else {
                            throw new CoreException( "Aborting the import because no customer could be found with number: " + customerId );
                        }
                    }
                    else {
                        throw new CoreException( "Aborting the import because line nr: " + lineNr + " does not have enough columns; only " + lineParts.length + " columns were found but there must be " + this.nrOfColumns + " columns." );
                    }
                }
                //Catch CoreExceptions separately and re-throw them so any possible stack traces won't be so huge
                catch (CoreException e) {
                    throw e;
                }
                catch (Exception e) {
                    throw new CoreException(e);
                }
            }

            //To make sure that there are no uncommitted objects remaining commit both batches.
            createBatch.commit();
            changeBatch.commit();
        }
        //Catch CoreExceptions separately and re-throw them so any possible stack traces won't be so huge
        catch (CoreException e) {
            throw e;
        }
        catch (Exception e) {
            throw new CoreException(e);
        }
        finally {
            //Close the reader and clear both object maps, so java won't keep them in memory
            try {
                reader.close();
            } catch (Exception e) { /* ignore close failures */ }
            this.customerObjects.clear();
            this.curObjects.clear();
        }
    }


    /**
     *  Process the value: if the value has any leading or trailing spaces, remove them.
     *  And if the value starts or ends with a " remove that as well.
     */
    private String processValue( String value ) {
        value = value.trim();
        if( value.startsWith("\"") ) {
            value = value.substring(1);
        }
        if( value.endsWith("\"") ) {
            value = value.substring(0, value.length()-1);
        }

        //Remove the spaces that might have been between the quotes
        return value.trim();
    }
}
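One Java gotcha worth knowing about the line parsing above: `String.split(";")` silently drops trailing empty fields, so a row that ends in empty columns can fail the `lineParts.length >= nrOfColumns` check even though the CSV line is valid. Passing a negative limit keeps those fields, as this small demo shows:

```java
public class SplitDemo {
    public static void main(String[] args) {
        String line = "12345;RET-1;Widget;;";  // last two columns intentionally empty

        // Default split: trailing empty strings are removed from the result.
        String[] dropped = line.split(";");
        System.out.println(dropped.length);        // 3

        // With limit -1 the pattern is applied as often as possible and
        // trailing empty strings are kept, so every column survives.
        String[] kept = line.split(";", -1);
        System.out.println(kept.length);           // 5
    }
}
```

If your supplier files can contain empty trailing columns, using `line.split(_separator, -1)` in the importer avoids spurious "not enough columns" errors.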

I have implemented a similar solution to the ProductImporter above; however, I was experiencing Out of Memory exceptions once the table the import was populating grew too large. The following piece of code is the cause of the problem:

        /*
         * Retrieve all existing objects and place them in a map with the key column(sapnr) as key of the map and the object as the value
         * This way the objects can be retrieved later on.
         */
        List<IMendixObject> curObjects = Core.retrieveXPathQuery(this.context, "//" + this.objectType);
        for( IMendixObject object : curObjects ) {
            IMendixObject customer = Core.retrieveId(this.context, (IMendixIdentifier) object.getValue(Product.MemberNames.Product_Customer.toString()));
            this.curObjects.put(customer.getValue(Customer.MemberNames.SAPNumber.toString()) + (String)object.getValue( this.colMapping.get(this.keyColNr) ), object);
        }

The issue here is the first line of code, which sends an unbounded query to the database and then holds a list of all those objects in memory. The garbage collector will pick up the list once the constructor has finished executing; however, once a certain number of objects has been retrieved (this number depends on their complexity, number of attributes, associations, etc., and on your JVM memory settings) the JVM runs out of memory and the constructor never finishes.
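For context, the general way to keep such a retrieve memory-safe is to page through the table rather than fetch it in one go. The sketch below simulates that with a plain list standing in for the database table; the `fetchPage` helper is hypothetical and only illustrates the limited/offset pattern, it is not a Mendix API:

```java
import java.util.ArrayList;
import java.util.List;

public class PagedRetrieve {
    // Hypothetical stand-in for a limited/offset database retrieve:
    // returns at most 'limit' records starting at 'offset'.
    static List<String> fetchPage(List<String> table, int offset, int limit) {
        int end = Math.min(offset + limit, table.size());
        if (offset >= end) return new ArrayList<>();
        return new ArrayList<>(table.subList(offset, end));
    }

    // Process the whole table one page at a time; only one page is ever
    // held in memory, regardless of how large the table grows.
    static int processAll(List<String> table, int pageSize) {
        int processed = 0;
        int offset = 0;
        List<String> page;
        while (!(page = fetchPage(table, offset, pageSize)).isEmpty()) {
            processed += page.size();   // real code would build its key map from this page only
            offset += pageSize;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>();
        for (int i = 0; i < 1050; i++) table.add("product-" + i);
        System.out.println(processAll(table, 200));   // 1050
    }
}
```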

To combat this issue I removed the offending code and the map of identifiers entirely, and replaced the following:

//If the key of this object already exists. the object has to be changed. So add that object to the change batch
if( this.curObjects.containsKey( customerId + processValue(lineParts[this.keyColNr]))  ) {
     curBatch = changeBatch;
     changeBatch.next(this.curObjects.get(customerId + processValue(lineParts[this.keyColNr])) );
}

with a bounded XPath query returning all objects whose key member equals the key previously used in the map; in this case it would be:

Core.retrieveXPathQuery(this.context, "//" + this.objectType + "[SAPNumber = '" + customerId + processValue(lineParts[this.keyColNr]) + "']");

We then check whether the object should be newly created or updated by calling the size() method on the returned List. This approach prevents the unnecessary instantiation of every IMendixObject that we are not concerned with, therefore speeding up the import and preventing Out of Memory exceptions. The only reason I can see for holding the map of identifiers in memory is to reduce the number of round trips between the application and the database; however, I have tested my solution against the original one with varying table sizes, and even when the table is very small (making the size of the retrieved list negligible) there is still no performance gain from mapping the identifiers in memory.
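The per-line create-or-update decision then reduces to a keyed lookup whose result size tells you which batch to use. Sketched in plain Java, with a `Map` of lists standing in for the bounded XPath query (all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyedLookup {
    // Stand-in for the bounded XPath query: return all stored records
    // matching the key, typically zero or one result for a unique key.
    static List<String> retrieveByKey(Map<String, List<String>> store, String key) {
        return store.getOrDefault(key, new ArrayList<>());
    }

    // Decide per CSV line whether to create or change, mirroring the size() check.
    static String decide(Map<String, List<String>> store, String key) {
        return retrieveByKey(store, key).isEmpty() ? "create" : "change";
    }

    public static void main(String[] args) {
        Map<String, List<String>> store = new HashMap<>();
        store.put("CUST1-A-100", List.of("existing product"));
        System.out.println(decide(store, "CUST1-A-100"));  // change
        System.out.println(decide(store, "CUST1-B-200"));  // create
    }
}
```

Only the handful of objects matching the current line are ever instantiated, which is what keeps the memory footprint flat.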


In Ollie's approach above, since the bounded XPath query in effect submits repeated select queries to the same database table, the DB server will typically read the data from its buffer cache and so does not need to issue a separate physical read for each query. In effect the data is held in memory by the DB server rather than by the JVM.
