I have found this question and answer already, and together they provide most of the answer to my problem, in format entirely intelligible to a complete novice!
I have an additional query to hopefully fill in some of the gaps in both my process and my understanding:
I have a series of XML files each containing all the information for a single person, instead of a single CSV containing multiple people, as the salient input. This information requires splitting and manipulation, so I have multiple streams in my Kettle transformation to reflect this (and will have a 'for each file' loop set up in the parent Job to handle multiple files).
To use the method outlined in the selected answer to import my processed data from a person file into a database, do I need to recombine my many processed datastreams into several streams within which all the data to be joined resides, or is there an alternate way of approaching this?
i.e. If I have datastreams A, B, C, D & E in my Kettle transformation, and within my database A is joined to B & C, and D to E, do I necessarily need to combine streams A, B & C into one stream and D & E into another?
Thanks in advance
Update: I have now solved this issue. To do so involved (as many problems) coming at it from a different direction, but I'm adding my solution to possibly aid a similarly unskilled Kettle user:
I used the operator outlined in the linked question (combined insert/update) to insert/update the data into my 'outermost' tables for each file (streamed using the StAX operator), my schema being much akin to a spider's web, with a nodal 'core' table of unique information for each file, branching off into tables where I aim to store single incidences of data common to many files, e.g. country.
The data streams in my transformation now being in the database, I join streams incrementally (e.g. place+city+country streams after processing and insertion are related to generate a location entity in a location table) producing the requisite relationships. This eventually results in producing join tables from e.g. my location table to the 'core' table, such that all the disparate data for a given XML file is appropriately connected.