Kettle loader - possible performance improvement suggestion

In my RxWorks conversion, I have adopted a methodology of first loading the entities/acts and then creating the required links. For example, for the consult records: load all the act.patientClinicalEvent data, load all the act.patientWeight data, then load the actRelationship.patientClinicalEventItem data to link them.

For the first two, I need the etl_log records so that I have the required mappings for the third.

However, for the third transform (which generates the links) I will never need the etl_log data.

Currently there is no way to tell the loader "I don't need the etl_log records - don't write them". (I do realise that the 'skip processed' facility requires the etl_log records.)

Suggestion - make the ID Field Name optional rather than mandatory. If no field name is entered, nothing is written to the etl_log.
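
To make the idea concrete, here is a rough sketch of the behaviour I have in mind. It is a toy model only: the Loader class, its load method and the in-memory lists are hypothetical stand-ins, not the actual Kettle loader code.

```python
from typing import Optional


class Loader:
    """Toy stand-in for the loader step (hypothetical, not the real plugin)."""

    def __init__(self, id_field_name: Optional[str] = None):
        # None or an empty string means "no ID Field Name was entered".
        self.id_field_name = id_field_name
        self.saved = []    # stands in for the objects written to the database
        self.etl_log = []  # stands in for the etl_log table

    def load(self, row: dict) -> None:
        # The object itself is always saved.
        self.saved.append(row)
        # The etl_log record is written only when an ID field is configured.
        if self.id_field_name:
            self.etl_log.append(row[self.id_field_name])


# Acts: the ID field is set, so the mapping is logged for later transforms.
acts = Loader(id_field_name="id")
acts.load({"id": "E1", "type": "act.patientClinicalEvent"})

# Links: no ID field, so the row is saved but nothing is logged.
links = Loader()
links.load({"source": "E1", "target": "W1"})
```

With this, the first two transforms would keep the ID field set, and only the link transform would leave it blank.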

Regards, Tim


Re: Kettle loader - possible performance improvement suggestion

The etl_log data is required to avoid generating duplicates if a row has already been processed.

So while it would improve performance, if you re-run the transforms you will end up with duplicate relationships.
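
As a simplified illustration of the problem (a toy model with hypothetical names, not the loader's real code), compare re-running a transform with and without a log of processed rows:

```python
def run_transform(rows, relationships, etl_log=None):
    """Create one relationship per row, skipping rows already in etl_log.

    etl_log is a set of processed row ids; None means logging is disabled.
    """
    for row_id, source, target in rows:
        if etl_log is not None:
            if row_id in etl_log:
                continue  # already processed on an earlier run
            etl_log.add(row_id)
        relationships.append((source, target))


rows = [("r1", "E1", "W1")]

# With the log, the second run skips the row.
with_log, log = [], set()
run_transform(rows, with_log, log)
run_transform(rows, with_log, log)

# Without the log, the second run creates the relationship again.
without_log = []
run_transform(rows, without_log)
run_transform(rows, without_log)
```

After two runs, with_log holds a single relationship while without_log holds the same relationship twice.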

-Tim (A)
