Working with Files Containing Unprintable Characters

I was working on a file that was originally produced by a PC program (PPTIA, the PowerPoint Internet Assistant from Microsoft). I had transferred the file to my HP-UX system, and I needed to edit it (clean it up) before putting it up on our World Wide Web server. The file was not editable in Visual mode because many lines contained unprintable (non-printing ASCII) characters. These were shown as dots/periods on the Visual mode screen, and Qedit showed a question mark at the start of each line to indicate that there were questionable contents.

Question 1: What were these characters?

Question 2: How could I get rid of them?

Answer 1:

I used the Hex and Char options ($h and $c) of Qedit's List command to see the numeric values of the offending characters. For example:

qux/list $h $c 35
   35
 0000: 0D3C 4832 3E52 6F62 656C 6C65 2043 6F6E 7375 6C74 .<H2>Robelle Consult
 000A: 696E 6720 4C74 642E 3C2F 4832 3E20                ing Ltd.</H2>
I knew the line should have started with <H2>, but it had an extra mystery character at the start, which was shown as a dot on the right side of the List output. In the example, on the left side we can see that the character has a hex value of 0D.
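
For readers who want to do the same inspection outside Qedit, here is a minimal Python sketch of the idea: scan a line for bytes outside the printable ASCII range and report each one in hex and decimal. The function name find_unprintables and the sample line are made up for illustration; this is not how Qedit's List command is implemented.

```python
def find_unprintables(line: bytes):
    """Return (offset, hex, decimal) for each byte outside printable ASCII (32..126)."""
    return [(i, f"{b:02X}", b) for i, b in enumerate(line) if b < 32 or b > 126]


# Sample line modelled on line 35 of the example: a stray CR before <H2>.
sample = b"\r<H2>Robelle Consulting Ltd.</H2> "

for offset, hex_value, decimal in find_unprintables(sample):
    print(f"offset {offset}: hex {hex_value} = decimal {decimal}")
# prints: offset 0: hex 0D = decimal 13
```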

Answer 2:

We can use the Change command to change the offending character to a normal printing character so that the line can be edited in Visual mode, or we can use Change to remove the character by changing it to nothing (a null string). You can specify strings in the Change command by their numeric values. The numeric values must be specified in decimal, from 0 through 255. We know the hex value, 0D. Using the handy calculator built into Qedit, we see that hex 0D is decimal 13:

   qux/=$0d
   Result=13.0
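
If Qedit's calculator is not at hand, the same hex-to-decimal conversion can be checked elsewhere. A small Python sketch (Python is only my choice for illustration):

```python
hex_value = "0D"

# Interpret the string as a base-16 number.
decimal_value = int(hex_value, 16)
print(decimal_value)  # prints 13

# And back again, as two upper-case hex digits.
print(format(decimal_value, "02X"))  # prints 0D
```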
Now we can change the character whose value is 13 to nothing. We'll do it in all lines of the file. First we put Qedit into "decimal mode", then we do the Change command:
   qux/set decimal on
   qux/change '13 "" all
      35     <H2>Robelle Consulting Ltd.</H2>
      36     </P>
      37     <P>  -->
   3 lines changed
There were a number of other mystery characters besides 0D/13. I changed these to printing characters. Once I had figured them all out, I put the Change commands into a Use file, so that I could easily make the same changes to every file I created with PPTIA. The Use file had these lines in it:
qux/l ufix
    1     set dec on
    2     cq '9   " "     @   {HT tab}
    3     cq '11  "<BR>"  @   {VT vertical tab, new line}
    4     cq '13  ""      @   {CR}
    5     cq '145 "'"     @   {opening single quote}
    6     cq '146 "'"     @   {closing single quote, apostrophe}
    7     cq '147 \"\     @   {opening double quote}
    8     cq '148 \"\     @   {closing double quote}
    9     set dec off
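
The same substitutions can be sketched outside Qedit as a byte-for-byte mapping. The Python below is only an illustration of the technique, with the table copied from the Use file above; the names REPLACEMENTS and clean are hypothetical. Values 145 through 148 are the curly quotation marks of Windows code page 1252, which matches the comments in the Use file.

```python
# Decimal byte value -> replacement bytes, mirroring the Use file above.
REPLACEMENTS = {
    9: b" ",       # HT tab
    11: b"<BR>",   # VT vertical tab, new line
    13: b"",       # CR, removed
    145: b"'",     # opening single quote
    146: b"'",     # closing single quote, apostrophe
    147: b'"',     # opening double quote
    148: b'"',     # closing double quote
}


def clean(data: bytes) -> bytes:
    """Replace each mapped byte; pass every other byte through unchanged."""
    return b"".join(REPLACEMENTS.get(b, bytes([b])) for b in data)


print(clean(b"\r<H2>Robelle Consulting Ltd.</H2>"))
# prints: b'<H2>Robelle Consulting Ltd.</H2>'
```

Unlike the Use file, this sketch works on raw bytes rather than on Qedit lines, so it would need to be applied to the file contents read in binary mode.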

Update: Another Method of Viewing Unprintables

Also: How to Edit Unprintables
