Notes from an RxWorks conversion

This is the beginning of a set of notes from my experiences in doing a conversion from RxWorks to OpenVPMS.  This is not intended to be a polished document, but rather a set of notes on things I have come across.  I will start this as one page, but as time goes on, I expect it will be restructured into a more accessible and logical form.

First some background: I know a lot about RxWorks and am familar with traditional relations databases, but this is the first time I have played with an entity-relationship database.  The system I am converting runs 24/7 and hence I need a conversion process which is either fast enough to be done in say 3 hours, or can be pre-converted and then use catch-up/synchronise processing.

 

Dataload or Kettle plugin?

In case you didn't realise there are two ways of loading data into the OpenVPMS database, the dataload utility (which processes XML formatted data), or the OpenVPMS Loader plugin for Kettle.

The dataload utility has its place, and I use it to load a practice-tailored setup file that contains all the basic data including lookups, postcodes, etc etc.

However dataload does not have the performance required to load a large dataset.  For this you need the Kettle plugin.  (In my enviroment, Kettle can load 14000 customer records in 2 minutes, dataload needs an hour.)

Data conditioning

OpenVPMS provides very tight data validation through the use of lookup tables.  RxWorks is far looser, and although it has lists for various items, it is not mandatory that the data entered matches whats in the lists.  This means that any conversion from RxWorks needs to do a reasonable amount of data cleanup work.  Fortunately Kettle is a good fast powerful tool for doing this. (The 2 minutes for 14000 customer records includes all the processing to correct the suburb names [found in either the RxWorks ADDRESS3 or ADDRESS2 slots], the titles, the account types, etc etc.)  

SQL snippets

Because of the entity-relationship design, SQL queries are not as simple as with a traditional database.  Put another way, it's not obvious how you have the write the query.  Hence this collection of snippets.

SQL to show lookups
select lookup_id, name, code from lookups where arch_short_name = "lookup.suburb";
select lookup_id, name, code, arch_short_name from lookups where arch_short_name like "lookup.%";

Differences

Below are a number of differences that may need to be considered during conversion.

Appointment duration:  RxWorks uses a start-time, duration setup; OPV uses start-time, end-time.  Hence in general one calculates the end-time by adding the duration to the start-time.  However, RxWorks will not let the appointment cross the midnight boundary, but OPV will.  Hence to mimic RxWorks, you need to terminate any appointment that ends in the next day at midnight.

Negative invoices:  RxWorks allows the total invoice amount to be negative, OPV does not.  Hence, its tempting to simply convert these negative invoices into credits.  However, if you need to link clinical event line items to invoices, then you cannot do this if the invoice containing the item has become a credit.  The solution is to split the RxWorks negative invoice into a credit holding the negative items (which will normally be a discount line item) and an invoice holding the other items.

Standard labels:  RxWorks has a concept of standard labels, OPV does not.  Hence your product conversion code needs to generate specific label text for those products which use a standard label.

No temperature:  RxWorks has fields for the patient temperature and weight; OPV has only weight.  So unless you want to invent a temperature archetype and the support for it, you need to record the RxWorks temperatures in the Visit Notes field.  You also need to educate the staff to record the temperature in the Visit notes. 

Notes overflow:  RxWorks uses a memo field (ie a huge text field) to hold visit notes; OPV provides 5000 characters.  Hence your conversion code needs to be able to split long RxWorks visit notes in multiple OPV visit notes.

Summary/Reason: Rxworks also uses memo fields for other things, notably the visit summary; OPV's visit Reason field is 50 characters long.  My solution was to put the first 50 characters in the Reason field, and if some were lost, to put the whole summary in the first line of the visit notes prefixed by "Summary:".

Other overflows:  In the case of other overflows (such as the RxWorks Transaction Text going in the OPV Invoice Notes), I just discarded the excess - on the basis that although it is critical to keep the all of the medical history, the financial stuff is less important and can afford to lose text from the end of the invoice notes.

Discard old data:  It is quite reasonable to discard some old information.  In my case we discarded all counter sales over two years old, all quotes/estimates over 12 months old, all appoinment data over two years old, and all data pertaining to clients and patients whose records had been deleted.  (RxWorks does not have full referential integrity and hence in some tables there will be records containing patients and/or clients who have been deleted.)

Database size:  In these days of terabyte disks, one really does not have to worry about space occupied by the database.  However the OPV database is much larger that the RxWorks database.  Firstly the OPV database structure requires more space (I suspect about 50% more) for everything except the attachments.  However, since in OPV, the attachments live inside the database, this adds hugely to the size.  Moreover, most of the attachements are incompressible jpeg files.

This leads us to the offsite backup problem.  Currently a zipped copy of the RxWorks database is FTP'ed offsite each night (around 150MB worth).  The 5GB of attachments are copied to a removable disk once a week and this is held offsite.

With 5GB of attachments inside the OPV database, the nightly FTP is not going to work, and I do not currently have a solution to this problem. [Although the site is in Hong Kong where it is common to gigabit bandwidth available, in our our building, the uplink only runs at some 150Kbps.]

Conversion strategy

As I indicated at the beginning I need a conversion strategy that will work with a 24/7 business environment.  I now know that we should be able to convert the system with about a 3 hour data-entry lockout.  Hence the plan is as follows:

  • shut down RxWorks and make a copy of the database
  • restart RxWorks telling the night staff that they can use it to look up patient records, but that if they do any data entry, this will be lost
  • convert everything except the >2year old visit and financial data (takes under 3 hrs)
  • bring OPV up and tell the staff that they can do anything except look at >2year old records, and all attachments
  • initiate the conversion of the >2year old data (takes about 2 hours)
  • start genbalance running (takes about 45 minutes), and docload running (takes ???)

Note that implicit in the above is the fact that one can use OPV while the various load utilities (ie Kettle, genbalance, and docload) are running.

 

That's it for the moment.  I may add more later and I will come back and fill in the ???? bits.

Syndicate content