Removing BOM (Byte Order Mark) Characters from Your ASPX Pages

The Problem


I recently had a painful problem. I had written a Python script to parse ASPX files, add a user control, replace some controls with other ones, and generally do a huge batch operation. Everything worked OK until I started seeing a funny sequence of characters in my pages when I was debugging.

After googling those exact characters I discovered that this is a BOM. A Byte Order Mark is used at the start of Unicode files to denote the byte order. As in: is it little Indian… err, endian, or big endian? Python obviously did this when editing the files. The really neat thing is that Visual Studio did not display these characters. If I could find the invisible character and delete it, Visual Studio would stall for a few seconds, and when I saved the file it would complain about the encoding changing and about source control. But those were the lucky cases. In most cases tracking down the invisible character was difficult, and Visual Studio didn’t want to delete it. It sat in what looked like empty gaps between tags. Yes, Visual Studio is an idiot, I guess.
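Since the character is invisible in the editor, one way to hunt it down is to look at the raw bytes instead. Here is a minimal Python sketch, assuming the files are UTF-8 and the stray character is the UTF-8 encoded BOM (bytes EF BB BF). The function name `find_utf8_bom_offsets` is made up for this example and is not part of my original script.

```python
import codecs

def find_utf8_bom_offsets(data):
    """Return every byte offset where the UTF-8 BOM (EF BB BF) occurs.

    Assumes `data` is the raw content of a UTF-8 encoded file.
    """
    offsets = []
    start = 0
    while True:
        idx = data.find(codecs.BOM_UTF8, start)
        if idx == -1:
            return offsets
        offsets.append(idx)
        # Continue searching just past the BOM we found.
        start = idx + len(codecs.BOM_UTF8)
```

An offset of 0 is the usual BOM at the very start of a file. Any other offset means a BOM ended up somewhere in the middle of the markup, which is the kind that shows up as garbage in the rendered page.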

The Solution



Notepad2. A freeware editor that has no R&D department behind it, no IntelliSense, and no really fancy features. Today it was better than Visual Studio. Opening the offending file in Notepad2, I could immediately see the character encoding in the status bar: UTF-8. Choose File -> Encoding -> ANSI and then File -> Save, and the BOM suddenly shows up, but as a question mark (?). Delete the character and voilà!
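If there are too many files to fix by hand, the same cleanup could probably be scripted. This is only a sketch under a few assumptions: the files are UTF-8, every BOM in them is unwanted (including the one at the very start), and the helper names `strip_utf8_boms` and `clean_file` are invented for this example. Try it on a copy first.

```python
import codecs

def strip_utf8_boms(data):
    """Remove every UTF-8 BOM (EF BB BF) from raw file bytes.

    In valid UTF-8 this byte sequence only ever encodes U+FEFF,
    so a byte-level replace does not touch other characters.
    """
    return data.replace(codecs.BOM_UTF8, b"")

def clean_file(path):
    """Rewrite the file without BOMs. Return True if it was changed."""
    with open(path, "rb") as f:
        original = f.read()
    cleaned = strip_utf8_boms(original)
    if cleaned == original:
        return False  # nothing to do, leave the file untouched
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Working on bytes rather than decoded text keeps the rest of the file exactly as it was, line endings included. Expect Visual Studio and source control to still notice the encoding change on the files that get rewritten.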

If you want to get Notepad2, you can grab it at: http://www.flos-freeware.ch/
