So I have this XML file.
<?xml version="1.0" encoding="UTF-8"?>
<CPR:eCPR xmlns:CPR="http://www.google.com/">
<CPR:contractorInfo>
<CPR:contractorName>Company Name</CPR:contractorName>
</CPR:contractorInfo>
<CPR:projectInfo>
<CPR:projectLocation>EARTH</CPR:projectLocation>
</CPR:projectInfo>
<CPR:payrollInfo>
<CPR:statementOfNP>false</CPR:statementOfNP>
<CPR:employees>
<CPR:employee>
<CPR:name id="1111:First One">First One</CPR:name>
</CPR:employee>
<CPR:employee>
<CPR:name id="2222:Second Two">Second Two</CPR:name>
</CPR:employee>
<CPR:employee>
<CPR:name id="3333:Third Three">Third Three</CPR:name>
</CPR:employee>
<CPR:employee>
<CPR:name id="4444:Fourth Four">Fourth Four</CPR:name>
</CPR:employee>
<CPR:employee>
<CPR:name id="5555:Fifth Five">Fifth Five</CPR:name>
</CPR:employee>
</CPR:employees>
</CPR:payrollInfo>
</CPR:eCPR>
I need to split it so that each file contains at most "n" employees. For example, if each file should have 2 employees, there will be 3 files, with the last file containing only one employee, while all the other tags are kept in every file.
What I want (file1)
<?xml version="1.0" encoding="UTF-8"?>
<CPR:eCPR xmlns:CPR="http://www.google.com/">
<CPR:contractorInfo>
<CPR:contractorName>Company Name</CPR:contractorName>
</CPR:contractorInfo>
<CPR:projectInfo>
<CPR:projectLocation>EARTH</CPR:projectLocation>
</CPR:projectInfo>
<CPR:payrollInfo>
<CPR:statementOfNP>false</CPR:statementOfNP>
<CPR:employees>
<CPR:employee>
<CPR:name id="1111:First One">First One</CPR:name>
</CPR:employee>
<CPR:employee>
<CPR:name id="2222:Second Two">Second Two</CPR:name>
</CPR:employee>
</CPR:employees>
</CPR:payrollInfo>
</CPR:eCPR>
Here is what I did so far
$limit = 2
$logpath = "C:\dev\project\doximity\temp.xml"
[xml]$xml = Get-Content $logpath
$nsm = New-Object System.Xml.XmlNamespaceManager($xml.NameTable)
$nsm.AddNamespace("CPR", "http://www.google.com/")
$index = 1
$ref = New-Object Xml.XmlDocument
$ref.XmlResolver = $null
$rows = $xml.SelectNodes("//CPR:employee", $nsm)
$c = $rows.Count
$rows | ForEach-Object {
if ($index -eq 1) {
$InsertNode = $ref.CreateElement("CPR", "employees", "http://www.google.com/")
$InsertNode.InnerXml = ""
$ref.AppendChild($InsertNode)
}
$ref.DocumentElement.AppendChild($ref.ImportNode($_, $true))
$c--
if ($index -eq $limit) {
$index = 1
$ref.Save("C:\dev\project\doximity\chunck{0:D3}.xml" -f ++$i)
$ref = New-Object Xml.XmlDocument
$ref.XmlResolver = $null
if ($c -lt $limit) { $limit = $c }
} else {
$index++
}
}
And the output is
<CPR:employees xmlns:CPR="http://www.google.com/">
<CPR:employee>
<CPR:name id="1111:First One">First One</CPR:name>
</CPR:employee>
<CPR:employee>
<CPR:name id="2222:Second Two">Second Two</CPR:name>
</CPR:employee>
</CPR:employees>
What am I missing?
jdweng's helpful answer shows a solution based on LINQ-to-XML (System.Xml.Linq.XDocument).
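Your attempt builds a brand-new document whose root is only the CPR:employees element, which is why the surrounding elements are missing from the output. Here's a streamlined formulation of your own [xml] (System.Xml.XmlDocument)-based attempt that instead modifies the original document in place and saves it once per chunk: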
# Create an XML DOM and load its content from a file.
# NOTE: Be sure to use a *full path*, because .NET's working directory
# usually differs from PowerShell's current location.
$xml = [xml]::new() # shorter and more efficient alternative to New-Object Xml.XmlDocument
$xml.Load("C:\dev\project\doximity\temp.xml")
# Determine the parent element of interest, using *dot notation*.
$empsRootElement = $xml.eCPR.payrollInfo.employees
# Get all child nodes (elements) as an array.
$empsElements = @($empsRootElement.ChildNodes)
# Determine the chunk size and the number of chunks.
$n = 2
$chunks = [math]::Ceiling($empsElements.Count / $n)
# Process each chunk.
foreach ($i in 0..($chunks-1)) {
# Remove all child nodes (note: .RemoveAll() also removes the element's attributes, if any).
$empsRootElement.RemoveAll()
# Add the next chunk as the (only) children.
$empsElements[($i*$n)..(($i+1)*$n-1)].
ForEach({ $null = $empsRootElement.AppendChild($_) })
# Save the chunk to a sequence-numbered file.
$xml.Save(("C:\dev\project\doximity\chunk{0:D3}.xml" -f (1+$i)))
}
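If you want to try the approach without touching your real file, the following is a minimal, self-contained sketch of the same technique. It makes a few assumptions that are not part of your setup: the sample XML is inlined via a here-string, the chunks are written to a freshly created folder under the temp directory, and the variable names $outDir and $last are purely illustrative. At the end it reloads each chunk and reports how many employees it contains and whether the surrounding elements survived.

```powershell
# Hypothetical sample input, inlined so the sketch runs without temp.xml.
$xml = [xml]::new()
$xml.LoadXml(@'
<CPR:eCPR xmlns:CPR="http://www.google.com/">
  <CPR:contractorInfo>
    <CPR:contractorName>Company Name</CPR:contractorName>
  </CPR:contractorInfo>
  <CPR:payrollInfo>
    <CPR:statementOfNP>false</CPR:statementOfNP>
    <CPR:employees>
      <CPR:employee><CPR:name id="1111:First One">First One</CPR:name></CPR:employee>
      <CPR:employee><CPR:name id="2222:Second Two">Second Two</CPR:name></CPR:employee>
      <CPR:employee><CPR:name id="3333:Third Three">Third Three</CPR:name></CPR:employee>
      <CPR:employee><CPR:name id="4444:Fourth Four">Fourth Four</CPR:name></CPR:employee>
      <CPR:employee><CPR:name id="5555:Fifth Five">Fifth Five</CPR:name></CPR:employee>
    </CPR:employees>
  </CPR:payrollInfo>
</CPR:eCPR>
'@)

# Assumption: write to a new, uniquely named folder in the temp directory.
$outDir = Join-Path ([IO.Path]::GetTempPath()) ('xml-chunks-' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $outDir

# Take a snapshot of the employee elements before modifying the document.
$empsRootElement = $xml.eCPR.payrollInfo.employees
$empsElements = @($empsRootElement.ChildNodes)

$n = 2
$chunks = [math]::Ceiling($empsElements.Count / $n)

foreach ($i in 0..($chunks - 1)) {
  # Empty the <CPR:employees> element, then re-add only the current chunk.
  $empsRootElement.RemoveAll()
  # Clamp the upper index so the last (shorter) chunk stays in range.
  $last = [math]::Min(($i + 1) * $n, $empsElements.Count) - 1
  foreach ($e in $empsElements[($i * $n)..$last]) {
    $null = $empsRootElement.AppendChild($e)
  }
  # Save the whole document, so all surrounding elements are preserved.
  $xml.Save((Join-Path $outDir ('chunk{0:D3}.xml' -f ($i + 1))))
}

# Reload each chunk and report its employee count and contractor name.
Get-ChildItem $outDir -Filter 'chunk*.xml' | Sort-Object Name | ForEach-Object {
  $doc = [xml]::new()
  $doc.Load($_.FullName)
  '{0}: {1} employee(s), contractor: {2}' -f $_.Name,
    @($doc.eCPR.payrollInfo.employees.ChildNodes).Count,
    $doc.eCPR.contractorInfo.contractorName
}
```

Clamping the upper index is optional in default PowerShell, where out-of-range indices in a slice are simply ignored, but it makes the intent for the last chunk explicit. Note that, like the code above, this sketch assumes there is at least one employee element.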