xsl-list@lists.mulberrytech.com

RE: [xsl] Merging two sets of files Emma Burrows Tue Apr 03 08:02:34 2012

In fact the outputs are as follows:

- a ditamap and a set of standalone DITA topic files based on all the records 
in book.xml except drug records with catalogue entries
- standalone documents from book.xml for drug records that don't have 
corresponding drug information
- a new version of each of the catalogue files that has a corresponding 
book.xml record

What I didn't mention is that I already have a process to convert book.xml into 
a ditamap and create all the standalone documents. This is tried and tested and 
I am not messing with it two weeks before delivery. :)

The merging is a new, late requirement - I was hoping to just bolt it on to the 
existing transform by hooking it into the drug record matching template, but 
maybe that's not sensible. I am investigating doing a completely separate 
transform.
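
For what it's worth, a separate merge transform could be driven by lookup.xml rather than by the drug record template. The following is only a sketch under assumptions: the lookup.xml layout (entry elements with book-id and cat-id attributes), the record element with an id attribute in book.xml, the catalogue/ and merged/ folder names, and the merged wrapper element are all invented for illustration. Grouping by catalogue id means each merged file is written exactly once, with every matching book.xml record in it, which is what points 3 and 4 need.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed layout: <lookup><entry book-id="b1" cat-id="c1"/>...</lookup> -->
  <xsl:variable name="lookup" select="document('lookup.xml')"/>

  <!-- Index book.xml records by id; 'record' and '@id' are assumed names -->
  <xsl:key name="record-by-id" match="record" use="@id"/>

  <!-- The main input document is book.xml -->
  <xsl:template match="/">
    <xsl:variable name="book" select="."/>
    <!-- One group per catalogue id, so each output file is created only once -->
    <xsl:for-each-group select="$lookup/lookup/entry" group-by="@cat-id">
      <!-- Assumed location of the auto-generated catalogue files -->
      <xsl:variable name="catalogue"
          select="document(concat('catalogue/', current-grouping-key(), '.xml'))"/>
      <!-- Written to a different folder, so no file is both read and written -->
      <xsl:result-document href="merged/{current-grouping-key()}.xml">
        <merged>
          <xsl:copy-of select="$catalogue/*"/>
          <!-- Every book.xml record that lookup.xml maps to this catalogue id -->
          <xsl:for-each select="current-group()">
            <xsl:copy-of select="key('record-by-id', @book-id, $book)"/>
          </xsl:for-each>
        </merged>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Records with no entry in lookup.xml never appear in any group here, so the existing conversion can carry on handling them untouched.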


-----Original Message-----
From: Emmanuel Bégué [mailto:[EMAIL PROTECTED]]
Sent: 03 April 2012 10:50
To: [EMAIL PROTECTED]
Subject: Re: [xsl] Merging two sets of files

Do I understand your requirements correctly -- you need to output
- a new version of book.xml with associated catalog information from the drug 
database
- standalone documents from book.xml for topics that don't have corresponding 
drug info
- a new version of each of the 10,000 catalogue files for which information 
can be found in book.xml (not sure about that last part -- requirements 3 
and 4?)

I would build two temporary documents:
- a new book.xml with all associated catalog information
- a new big catalog file (from all the relevant little drug files), with all 
associated book information

and then in a second pass, cut those big documents to output the required 
result files as needed.

One way to do two passes with Saxon is to use saxon:next-in-chain (variables 
are really painful to use).
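
A rough sketch of how the first pass might hand its output to the second with saxon:next-in-chain. The stylesheet name split.xsl and the merged-book wrapper are made up for illustration, and a real first pass would add the catalogue information rather than just copy the input. I believe the attribute is only honoured by Saxon editions that support the saxon: extensions, so check that against your setup.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Pass the principal output of this stylesheet to split.xsl (hypothetical name) -->
  <xsl:output method="xml" saxon:next-in-chain="split.xsl"/>

  <!-- First pass: build the big temporary document -->
  <xsl:template match="/">
    <merged-book>
      <!-- Placeholder: the real pass would merge in the catalogue information here -->
      <xsl:copy-of select="*"/>
    </merged-book>
  </xsl:template>

</xsl:stylesheet>
```

The second stylesheet then sees merged-book as its input and can cut it into the result files with xsl:result-document.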

Hope this helps.
Regards,
EB


2012/4/3 Emma Burrows <[EMAIL PROTECTED]>:
> I'm currently using XSLT 2.0 (using Saxon 9.3 via Oxygen 12) to merge two 
> sets of XML files together based on a third file which is a kind of lookup 
> table. However, I'm coming across a problem when I need to effectively merge 
> two source files into the same output file, and I need some suggestions on a 
> change of approach.
>
> I have the following XML files:
>
> - Main document - let's call it book.xml. This contains various types
> of topics, including about 4000 topics related to drugs, each identified by a 
> unique id.
>
> - Ancillary drug information files auto-generated from an online drug 
> database.
> These are about 10,000 little XML files, each named after the unique id of 
> the drug information in the online catalogue.
>
> - An XML file - let's call it lookup.xml - that is essentially a look-up 
> table, matching ids in book.xml to one or more drug catalogue ids, and vice 
> versa. However, not all records in book.xml have an entry in lookup.xml.
>
> Now my requirement is to convert book.xml from its current proprietary format 
> into a DITA-based specialisation, and while I'm doing that:
>
> 1- Output the records with no corresponding catalogue entry as standalone 
> documents.
>
> 2- Merge each drug record in book.xml that has catalogue entries with the 
> corresponding auto-generated catalogue file(s), based on lookup.xml.
>
> 3- If a record in book.xml has more than one catalogue id in lookup.xml, I 
> need to copy the book.xml record into every one of the corresponding 
> auto-generated files.
>
> 4- If more than one record in book.xml corresponds to one catalogue id in 
> lookup.xml, I need to merge all the book.xml records with that same catalogue 
> file.
>
> 5- Make sure the converted and merged files are referenced in the correct 
> location in the book's hierarchy.
>
> I expect we'll ultimately do something more sensible like use conref rather 
> than tamper with the auto-generated files, but merging them is my current 
> brief as it stands.
>
> Point 4 is the immediate stumbling block because my solution to fulfilling 
> points 2 and 3 was as follows:
>
> 1. Convert the book.xml drug record into the desired DITA format and place 
> that in a variable.
> I'm doing this based on a matched template, so this happens whenever the 
> processor "encounters" a drug record as it traverses book.xml. This ensures 
> that I can export records with no catalogue id and keep track of where the 
> record was in the hierarchy.
>
> 2. Use the lookup.xml file to find corresponding catalogue ids for that 
> record.
>
> 3. For each catalogue id, open the corresponding catalogue file using 
> document(), and result-document it to a new file with the contents of the 
> variable inserted in the XML.
>
> The problem is that in step 3, I can't reopen a document that was previously 
> created by the transform, so I can't "add" a new book.xml record to the 
> contents of an already generated catalogue file, even by outputting a new 
> file with a different name.
>
> I can see that I'll probably need a process with an intermediate step, 
> perhaps using lookup.xml to guide the processing so I can group records with 
> the same catalogue id. But the only trouble with that is what to do with 
> records that don't appear in lookup.xml...
>
> Anyway, I hope all this is clear and I'm open to ideas. :)
>
>
> --~------------------------------------------------------------------
> XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
> or e-mail: <mailto:[EMAIL PROTECTED]>
> --~--
>
