Display wchar_t string and wstring in gdb

When you print a wchar_t string or std::wstring while debugging C or C++ in gdb, you don't get to see the content of the string. At first I painstakingly cast the pointer at each offset to check the values, but then I decided there had to be a better way. I didn't find anything as simple as turning on an option, but you can script it in gdb as follows:

define wc_print
echo "
set $c = (wchar_t*)$arg0
while ( *$c )
  if ( *$c > 0x7f )
    printf "[%x]", *$c
  else
    printf "%c", *$c
  end
  set $c++
end
echo "\n
end

Then you can just type "wc <wide_string_variable_name>" (the wc short form of wc_print works as long as no other gdb command starts with wc). Non-ASCII characters are displayed as hex in square brackets.
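To make the display rule concrete outside gdb, here is a minimal C++ sketch of the same loop. The name render_wide is hypothetical (it is not part of gdb or of the script above), and it returns the text instead of printing it. Like the script, it copies ASCII through and writes anything above 0x7f as lowercase hex in square brackets.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper mirroring the wc_print loop: ASCII characters are
// copied through, anything above 0x7f becomes [hex], and the result is
// wrapped in double quotes (without the trailing newline the script echoes).
std::string render_wide(const wchar_t* s) {
    assert(s != nullptr);
    std::string out = "\"";
    for (; *s != L'\0'; ++s) {
        unsigned long code = static_cast<unsigned long>(*s);
        if (code > 0x7f) {
            // Room for '[', every hex digit of an unsigned long, ']' and NUL.
            char buf[2 + 2 * sizeof(unsigned long) + 1];
            std::snprintf(buf, sizeof buf, "[%lx]", code);
            out += buf;
        } else {
            out += static_cast<char>(code);
        }
    }
    out += '"';
    return out;
}
```

For example, render_wide(L"caf\u00e9") gives "caf[e9]" including the quotes, which is the kind of line the gdb command prints.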

This works for a wchar_t pointer, and even an STL wstring. I am guessing you don't need to call the c_str() member of the wstring because the first (and only? see below) data member of the wstring (._M_dataplus._M_p) is the wchar_t pointer.
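If you want to check that guess about the layout, something like the following C++ sketch could do it. It assumes GNU libstdc++, where _M_dataplus is the first member; other standard libraries lay out the string differently, so the result means nothing there. The names first_word_is_data_pointer and kIsLibstdcxx are made up for this illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// True only when building against GNU libstdc++, the library whose
// _M_dataplus._M_p member is discussed above.
#ifdef __GLIBCXX__
constexpr bool kIsLibstdcxx = true;
#else
constexpr bool kIsLibstdcxx = false;
#endif

// Hypothetical check: copy the first pointer-sized word out of the wstring
// object and compare it with c_str(). This peeks at implementation details
// and is only meant as a debugging experiment.
bool first_word_is_data_pointer(const std::wstring& s) {
    static_assert(sizeof(std::wstring) >= sizeof(const wchar_t*),
                  "wstring is smaller than a pointer");
    const wchar_t* first_word = nullptr;
    std::memcpy(&first_word, static_cast<const void*>(&s), sizeof first_word);
    return first_word == s.c_str();
}
```

When this returns true, casting the wstring object itself to a wchar_t pointer reads the same address that c_str() would return, which would explain why the gdb command accepts either kind of variable.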

printf %ls not reliable

At first I was using call printf with %ls, but it sometimes exhibited a strange failure where nothing at all was output. In my experience it is also unhelpful for non-ASCII text, and its behavior is unpredictable when the underlying wide-to-multibyte conversion (the kind wcstombs performs) fails. But FWIW, this is what I used (I had to cast the printf call to (void) or gdb would complain "no return type information available"):

define wc_print
call (void)printf("\"%ls\"\n",$arg0)
end

Some googling uncovered a developer who described similar symptoms of a bug in printf %ls on cygwin (I was using OS X 10.5.7 with gdb 6.3.50). Maybe there are platforms where this is reliable, but it wasn't for me.
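One plausible reason %ls struggles with non-ASCII text is that the wide characters must be converted to multibyte text in the current locale, and that conversion can fail outright. The C++ sketch below only illustrates this conversion failure; I can't say it is the cause of the empty output I saw. The helper name can_narrow is hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <clocale>
#include <cstdlib>
#include <cwchar>
#include <vector>

// Hypothetical check: can this wide string be converted to multibyte text
// in the current C locale? printf with %ls depends on the same kind of
// conversion succeeding.
bool can_narrow(const wchar_t* s) {
    assert(s != nullptr);
    // Worst case: every wide character needs MB_LEN_MAX bytes, plus a NUL.
    std::vector<char> buf(std::wcslen(s) * MB_LEN_MAX + 1);
    std::size_t written = std::wcstombs(buf.data(), s, buf.size());
    // wcstombs returns (size_t)-1 when a character cannot be converted.
    return written != static_cast<std::size_t>(-1);
}
```

A program that never calls setlocale runs in the "C" locale, where can_narrow(L"Hello") succeeds but a string containing a character such as L"\u4e2d" does not convert. The debugged process is often in exactly that state when you call printf from gdb.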

What happens with normal gdb print

When you normally print a std::wstring or wchar_t string you get output like this:

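$3 = {
  static npos = 4294967295,
  _M_dataplus = {
    <std::allocator<wchar_t>> = {
      <__gnu_cxx::new_allocator<wchar_t>> = {<No data fields>}, <No data fields>},
    members of std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::_Alloc_hider:
    _M_p = 0x808e0c
  }
}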
$4 = (const wchar_t *) 0x44920

It doesn't show the text content the way it does for a char*:

$2 = 0x3e090 "Hello"

So instead of "p <wide_string_variable_name>" you can now just use "wc <wide_string_variable_name>".

Defining a gdb command

Defining the command makes it very convenient to display any wide string. You can put the define in a file and source the file when you need it (using the gdb source command) or put it in your .gdbinit file. If your .gdbinit file is in your work folder, you will have the command every time you start gdb there.

You can also add the following after your command definition to document it:

document wc_print
wc_print <wide_string_variable_name>
Display <wide_string_variable_name> which is a wchar_t* or wstring.
end