Sometimes we need to clean up XML data by removing duplicate tags (elements or child elements). So, how can we remove them? The answer is XSLT. This post shows how to remove duplicate tags from XML with XSLT.
Here I'm using a very simple input XML as a sample.
Input XML:
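The original sample file is not reproduced here, so the following is only a minimal sketch of an input that matches the description in this post. The root element name, the <B> element, the first <address> and the child element names <street>, <state> and <zip> are assumptions made for illustration; only the <detail>/<A> structure and the values 'akkk', 120 Ridge, MA and 01760 come from the post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample: names other than detail, A and address are assumed -->
<root>
  <detail>
    <!-- Two <A> children with the same value: one is a duplicate -->
    <A>akkk</A>
    <A>akkk</A>
    <B>bkkk</B>
  </detail>
  <!-- Made-up first address, present only so the duplicates are the last two -->
  <address>
    <street>10 Main</street>
    <state>NY</state>
    <zip>10001</zip>
  </address>
  <address>
    <street>120 Ridge</street>
    <state>MA</state>
    <zip>01760</zip>
  </address>
  <!-- Exact copy of the previous <address>: a duplicate -->
  <address>
    <street>120 Ridge</street>
    <state>MA</state>
    <zip>01760</zip>
  </address>
</root>
```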
Check duplicate tags
If we analyze the above XML input, we can see two places where tags are duplicated.
First duplicate tags
The <detail> tag contains multiple <A> tags with the same value 'akkk', so only one distinct <A> tag should remain.
Second duplicate tags
The last <address> tag is exactly the same as the second-to-last <address> tag, with the same child elements and values, so these two tags are also duplicates. We need only one of them.
Both <address> tags carry the same values: 120 Ridge, MA, 01760.
Remove duplicate tags with XSLT
Here I've written a very simple XSLT that copies all elements and removes duplicate child elements that have the same name and string value within the same parent. It preserves both the tag name and the value of the elements it keeps.
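The original stylesheet is not reproduced here, so the following is only a sketch of one way to get the behavior described, assuming an XSLT 1.0 processor. The variable names `name` and `value` are illustrative, and the choice to strip whitespace-only text before comparing is an assumption, not necessarily what the original did.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: drop whitespace-only text nodes so that indentation
       differences do not affect the string-value comparison -->
  <xsl:strip-space elements="*"/>

  <!-- Copy attributes, text, comments and processing instructions unchanged -->
  <xsl:template match="@*|text()|comment()|processing-instruction()">
    <xsl:copy/>
  </xsl:template>

  <!-- Copy an element only if no earlier sibling has the same
       name and the same string value; otherwise emit nothing -->
  <xsl:template match="*">
    <xsl:variable name="name" select="name()"/>
    <xsl:variable name="value" select="string(.)"/>
    <xsl:if test="not(preceding-sibling::*[name() = $name and string(.) = $value])">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a <detail> that holds two <A> children with the value 'akkk' keeps only the first <A>, and a repeated <address> with identical child values is dropped. Two caveats apply: attributes are not part of the comparison, and because an element's string value is the concatenation of all its descendant text, two elements whose children differ but concatenate to the same text would also be treated as duplicates.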
If we run the above XSLT against the above input, it generates the deduplicated output.
If we analyze the output generated by the XSLT, we can see that all duplicate tags, whether elements or child elements, have been removed properly.