Strangely typeset backspaces


Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Strangely typeset backspaces

#1 Post by Liviu » 06 May 2013 20:27

Text files containing literal BACKSPACE characters (a.k.a. ^H, code 0x08) display strangely when type'd in the console. The same "type" command, however, saves a bit-identical copy when redirected to a file.

This is one test case input file, say asc-bs.txt. It is made of three sections, which are identical except that the top line in the first section has one fewer backspace.

Code: Select all

[[**]]

[[**◘◘◘◘]]
[[***◘]]
[[***◘◘]]
[[***◘◘◘]]
[[***◘◘◘◘]]

[[**◘◘◘◘◘]]
[[***◘]]
[[***◘◘]]
[[***◘◘◘]]
[[***◘◘◘◘]]

[[**◘◘◘◘◘]]
[[***◘]]
[[***◘◘]]
[[***◘◘◘]]
[[***◘◘◘◘]]
Note that the character displayed above as "◘" is a literal backspace in the actual .txt file.
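
To inspect such a file without the console acting on the control characters, caret notation (the "^H" spelling used above) is one option. Below is a minimal sketch in C; `to_caret` is a hypothetical helper name made up for this illustration, and leaving CR and LF untouched is an assumption made here so that line breaks survive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from any library): copy `in` to `out`,
 * spelling control characters in caret notation, e.g. 0x08 -> "^H".
 * CR and LF are passed through unchanged (an assumption of this sketch).
 * Output is truncated if `out` is too small. Returns the number of
 * characters written, not counting the terminating NUL. */
size_t to_caret(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && outsz > 0);

    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char)*in;
        int is_ctrl = (c < 0x20 && c != '\n' && c != '\r') || c == 0x7F;
        size_t need = is_ctrl ? 2 : 1;

        if (n + need >= outsz)          /* keep room for the NUL */
            break;

        if (is_ctrl) {
            out[n++] = '^';
            out[n++] = (char)(c ^ 0x40);  /* 0x08 -> 'H', 0x04 -> 'D' */
        } else {
            out[n++] = (char)c;
        }
    }
    out[n] = '\0';
    return n;
}
```

With this, the string "123\b\b+" comes out as "123^H^H+", so the backspaces can be counted by eye instead of being acted upon.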

Following is the output at a cmd prompt in XP SP3.

Code: Select all

C:\tmp>type asc-bs.txt >typed-asc-bs.txt

C:\tmp>fc /b asc-bs.txt typed-asc-bs.txt
Comparing files asc-bs.txt and TYPED-ASC-BS.TXT
FC: no differences encountered

C:\tmp>type asc-bs.txt
[[**]]

]]**
[[**]]
[]]**
[[]]*
[]]**

]]**
[[**]]
[[*]]
[[]]*
]]***

]]**
[[**]]
[[]]*
[[]]*
[]]**
What's of note is that (a) type'ing the file while redirecting the output to another file creates an identical copy, and (b) type'ing it at the console gives three different outputs for what would be expected to be identical lines.

FWIW, replacing the ASCII '*' character in the test file with the control character "♦" (^D, code 0x04) gives yet another/different set of (inconsistent) results.

Code: Select all

C:\tmp>type ctrl-bs.txt
[[♦♦]]

]]♦♦
[[♦♦]]
[]]♦♦
[[]]♦
]]♦♦♦

]]♦♦
[[♦]]
[]]♦♦
[[]]♦
]]♦♦♦

]]♦♦
[[♦]]
[[]]♦
]]♦♦♦
[]]♦♦

Liviu

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Strangely typeset backspaces

#2 Post by foxidrive » 06 May 2013 23:27

Even with the three sections exactly repeated - it displays differently.

Code: Select all

[[**]]

]]**
[[*]]
[]]**
[[]]*
[]]**

]]**
[[**]]
[[*]]
[[]]*
]]***

]]**
[[**]]
[[*]]
[[]]*
]]***

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: Strangely typeset backspaces

#3 Post by Liviu » 07 May 2013 12:34

foxidrive wrote:Even with the three sections exactly repeated - it displays differently.

Indeed. This re-confirms that backspaces on one line can affect the display on _following_ lines, which is the strange thing in all this.

One simple example proves the point quite clearly. Save two files, say asc-bs-1.txt and asc-bs-2.txt:

Code: Select all

123◘◘+
456◘+

Code: Select all

123◘◘◘+
456◘+
where the latter has one extra "◘" backspace on the top line. Then:

Code: Select all

C:\tmp>type asc-bs-1.txt
1+3
45+

C:\tmp>type asc-bs-2.txt
+23
4+6

The first line typed in either case is correct (noting that ^H moves the cursor to the left, but does not actually delete or overwrite characters already written). However, the second line types differently for the second file, with no apparent rhyme or reason, and this just because the line above it had one extra ^H backspace added. Quite odd.
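
For reference, the model of ^H described above (move the cursor left, erase nothing) can be written down in a few lines of C. This is only a sketch of the expected rendering of a single line, not of what the console actually does; `render_line` is a hypothetical name, and clamping the cursor at column 0 is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the *expected* display of one line:
 *  - '\b' moves the cursor one column left and erases nothing;
 *  - the cursor is clamped at column 0 (an assumption of this sketch);
 *  - any other character overwrites the cursor column and advances.
 * Characters that do not fit in `out` are dropped.
 * Returns the width of the rendered line. */
size_t render_line(const char *in, char *out, size_t outsz)
{
    size_t cursor = 0;   /* column where the next character lands */
    size_t width = 0;    /* rightmost column written so far, plus one */

    assert(in != NULL && out != NULL && outsz > 0);
    memset(out, 0, outsz);

    for (; *in != '\0'; ++in) {
        if (*in == '\b') {
            if (cursor > 0)
                --cursor;             /* move left, erase nothing */
        } else if (cursor + 1 < outsz) {
            out[cursor++] = *in;      /* overwrite this column */
            if (cursor > width)
                width = cursor;
        }
    }
    out[width] = '\0';
    return width;
}
```

Under this model "123\b\b+" gives "1+3", "123\b\b\b+" gives "+23", and "456\b+" gives "45+" no matter what was printed on the line before. That independence between lines is exactly the expectation that the second file's output violates.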

Liviu

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: Strangely typeset backspaces

#4 Post by dbenham » 19 May 2013 18:05

:? :shock: :twisted:
That is just crazy. I can't fathom what could possibly be going on behind the scenes that could give that result.

I see behavior like that and I wonder how CMD.EXE ever gives a reliable result.


Dave Benham

aGerman
Expert
Posts: 4743
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Strangely typeset backspaces

#5 Post by aGerman » 20 May 2013 05:15

It's definitely a C issue. To be honest I don't understand that behaviour, but I'm able to reproduce at least different outputs depending on how the text was saved or read. The text was taken from Liviu's initial post.

I used a C program to display the following output:

Code: Select all

1 ~~~~~~~~~~~~~~~
[[**]]

]]**
[[**]]
[[*]]
[[]]*
[]]**

]]**
[[**]]
[[*]]
[[]]*
[]]**

]]**
[[**]]
[[*]]
[[]]*
[]]**

2 ~~~~~~~~~~~~~~~
[[**]]

]]**
[[**]]
[]]**
[[]]*
[]]**

]]**
[[*]]
[]]**
[[]]*
[]]**

]]**
[[*]]
[[*]]
[[]]*
[]]**

3 ~~~~~~~~~~~~~~~
[[**]]

]]**
[[**]]
[]]**
]]***
[]]**

]]**
[[*]]
[[*]]
[[]]*
[]]**

]]**
[[**]]
[]]**
[[]]*
[]]**

~~~~~~~~~~~~~~~~~

In the first block the output comes line by line from an array ("lines"), while in the second block I used one continuous string ("all").
In the third block I read the content from a text file ("test.txt") (a quick look into the cmd.exe file indicates that WinAPI functions were used instead of standard C methods).

Strangely enough, if you redirect the output of the program into a text file, blocks 1 and 2 are identical and mirror the text that was hard-coded in the program, except that Lf becomes CrLf (that's OK). Block 3 also mirrors the text in "test.txt", except that CrLf becomes CrCrLf (which again is the normal behaviour when using printf()).
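
The CrLf to CrCrLf effect is the usual text-mode translation on output: every Lf is written as CrLf, including an Lf that already follows a Cr. A rough sketch of that rule in portable C follows; `text_mode_expand` is a hypothetical name used only for illustration and is not how the runtime library is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration of text-mode output translation:
 * each LF in `in` is emitted as CR LF; every other character,
 * including an existing CR, is copied as is.
 * Output is truncated if `out` is too small. Returns the number of
 * characters written, not counting the terminating NUL. */
size_t text_mode_expand(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && outsz > 0);

    for (; *in != '\0'; ++in) {
        size_t need = (*in == '\n') ? 2 : 1;

        if (n + need >= outsz)      /* keep room for the NUL */
            break;

        if (*in == '\n')
            out[n++] = '\r';        /* LF gains a leading CR */
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

Feeding it "a\n" yields the three bytes 'a' Cr Lf, while "a\r\n" (as read in binary from a file) yields 'a' Cr Cr Lf, which matches what the redirected block 3 shows.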


Perhaps a bit off topic, but it might help to understand - the C code that I used:

Code: Select all

#include <windows.h>
#include <stdio.h>

int main()
{
  int i = 0;
  HANDLE hFile = INVALID_HANDLE_VALUE;
  DWORD  dwBytesRead = 0;
  char buffer[1024] = {0};

  const char * const lines[19] = {
    "[[**]]",
    "",
    "[[**\b\b\b\b]]",
    "[[***\b]]",
    "[[***\b\b]]",
    "[[***\b\b\b]]",
    "[[***\b\b\b\b]]",
    "",
    "[[**\b\b\b\b\b]]",
    "[[***\b]]",
    "[[***\b\b]]",
    "[[***\b\b\b]]",
    "[[***\b\b\b\b]]",
    "",
    "[[**\b\b\b\b\b]]",
    "[[***\b]]",
    "[[***\b\b]]",
    "[[***\b\b\b]]",
    "[[***\b\b\b\b]]"
  };

  const char * all = "\
[[**]]\n\
\n\
[[**\b\b\b\b]]\n\
[[***\b]]\n\
[[***\b\b]]\n\
[[***\b\b\b]]\n\
[[***\b\b\b\b]]\n\
\n\
[[**\b\b\b\b\b]]\n\
[[***\b]]\n\
[[***\b\b]]\n\
[[***\b\b\b]]\n\
[[***\b\b\b\b]]\n\
\n\
[[**\b\b\b\b\b]]\n\
[[***\b]]\n\
[[***\b\b]]\n\
[[***\b\b\b]]\n\
[[***\b\b\b\b]]";

  puts("\n1 ~~~~~~~~~~~~~~~");

  for (; i < 19; ++i)
    printf("%s\n", lines[i]);

  puts("\n2 ~~~~~~~~~~~~~~~");

  printf("%s", all);  /* text passed as an argument, never as the format string */

  puts("\n\n3 ~~~~~~~~~~~~~~~");

  hFile = CreateFile("test.txt", GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
  if (hFile != INVALID_HANDLE_VALUE)
  {
    if (ReadFile(hFile, buffer, 1023, &dwBytesRead, NULL))
    {
      buffer[dwBytesRead] = 0;
      printf("%s", buffer);  /* file content must not be used as a format string */
    }
    CloseHandle(hFile);
  }

  puts("\n\n~~~~~~~~~~~~~~~~~");

  return 0;
}

For better understanding, some escape sequences I used in the code:
- \b backspace character
- \n new line character (actually a single Lf that becomes CrLf in the output stream).
- trailing \ escapes the new line in the C code (has nothing to do with the content of the string).

Regards
aGerman

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: Strangely typeset backspaces

#6 Post by Liviu » 20 May 2013 22:55

aGerman wrote:It's definitely a C issue.

Actually, worse than that - looks like it's a WriteConsole API issue. Difference being that a C issue would be likely confined to a particular compiler and/or runtime library, while the API affects any application using the console, including cmd itself.

A small test run off my second/shorter example:

Code: Select all

#include <windows.h>
#include <string.h>   /* strlen */

static const char *const szLines[] =
{
   "\n123\b\b+\n456\b+\n",
   "\n123\b\b\b+\n456\b+\n",
};

int main()
{
   HANDLE hOut; int nLine; DWORD dwOut;
   hOut = GetStdHandle(STD_OUTPUT_HANDLE);

   for(nLine = 0; nLine < sizeof(szLines) / sizeof(szLines[0]); nLine++)
      WriteConsole(hOut, szLines[nLine], strlen(szLines[nLine]), &dwOut, NULL);

   return 0;
}
matches the 'type' output, with the same odd behavior that an extra "\b" backspace on the first line affects the display of the next one:

Code: Select all

1+3
45+

+23
4+6

aGerman wrote:To be honest I don't understand that behaviour

Second that ;-)

Liviu
