PowerShell, SVN Log, and Encodings

To automate packaging changes for our system, we use PowerShell together with the standard SVN command-line client to extract the relevant commits and then export all files they touch to a package directory. After upgrading from SVN 1.6.12 to 1.8.1, however, I noticed an odd encoding incompatibility between SVN output and PowerShell: umlauts in filenames read from the SVN log came out garbled, and the subsequent SVN exports failed with “filename not found” errors.

A typical case: a filename like “Schätzung Thementöpfe.xlsx” turned into “Schõtzung Thement÷pfe.xlsx” when stored in a PowerShell variable. To make matters worse, piping SVN output directly to the host console displayed the umlauts correctly; the encoding problem only showed up when I stored the output in a variable for later use.

So I went searching the net and didn’t really find a solution. From this Stack Overflow query, however, I learned that, for whatever reason, SVN output apparently comes in the CP850 encoding. The MSDN article on System.Text.Encoding and some wild experimentation then showed me how to convert between encodings, and the final solution turned out to be rather elegant:

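# CP850 has no static shortcut on the Encoding class, so look it up by name
$SvnEncoding = [System.Text.Encoding]::GetEncoding("CP850")
# Capture the verbose log of the revision (stderr is merged into the same stream)
$revlog = & svn.exe log --revision $Revision --verbose --username $SvnUser --password $SvnPassword 2>&1
foreach($rawline in $revlog) {
  # Turn the garbled string back into bytes via CP850, then decode them with the system default encoding
  $line = [System.Text.Encoding]::Default.GetString( $SvnEncoding.GetBytes($rawline) )
  # ...
}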

Since CP850 is a rather exotic code page, the Encoding class has no static property for it, so I have to call the static method GetEncoding() to obtain an object for it. Then I store the SVN log output in a variable and iterate over its lines. Each “raw” line of the output is converted by the CP850 encoding object to a byte array, which is then converted back into a string using the system default encoding, for which the Encoding class helpfully provides a shortcut. In effect, GetBytes() recovers the bytes as they were before being read as CP850, and GetString() reinterprets them in the system default code page. The resulting string can then be stored, processed, and used in normal PowerShell code as necessary.
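To see the round trip in isolation, here is a minimal sketch that reproduces the garbling from the example filename and then undoes it. It rests on a few assumptions that are mine, not part of the setup above: the system default code page is taken to be Windows-1252 (typical for Western European Windows installations, but not guaranteed), both code pages are looked up by number rather than through Encoding.Default, and Repair-Mojibake is a hypothetical helper name. The umlauts are built from character codes so the snippet does not depend on the encoding of the script file itself.

```powershell
# Hypothetical helper (not part of the original script): undo a wrong decoding
# by turning the string back into bytes and decoding those bytes again.
function Repair-Mojibake {
    param(
        [string]$Text,
        [System.Text.Encoding]$WrongEncoding,   # encoding the text was mistakenly read with
        [System.Text.Encoding]$ActualEncoding   # encoding the bytes were really written in
    )
    $bytes = $WrongEncoding.GetBytes($Text)
    return $ActualEncoding.GetString($bytes)
}

# Assumption: 1252 stands in for the system default (ANSI) code page.
$cp850  = [System.Text.Encoding]::GetEncoding(850)
$cp1252 = [System.Text.Encoding]::GetEncoding(1252)

# "Schaetzung Thementoepfe.xlsx" with real umlauts (0xE4 = a-umlaut, 0xF6 = o-umlaut)
$original = "Sch$([char]0xE4)tzung Thement$([char]0xF6)pfe.xlsx"

# Simulate the problem: bytes written as Windows-1252 but read as CP850
$garbled = $cp850.GetString($cp1252.GetBytes($original))

# Apply the same two-step conversion as in the loop above
$repaired = Repair-Mojibake -Text $garbled -WrongEncoding $cp850 -ActualEncoding $cp1252

Write-Output $garbled
Write-Output $repaired
```

If the assumption about the code pages holds on your machine, $garbled should show the same wrong characters as in the example above and $repaired should match the original name. On a system with a different default code page, the second encoding would need to be adjusted accordingly.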

I hope that saves you some googling. Enjoy. 🙂
