Microsoft C++ Name Mangling Scheme

version 1.1 (November 25, 2005)

This document describes the C++ name mangling scheme used by Microsoft. I think it is currently the most complete document about this scheme.

Author

This document is maintained by Kang Seonghoon aka Tokigun.

If you want to discuss this scheme or this document, please mail me: <xxxxxxx at gmail dot com> where xxxxxxx is tokigun.

And sorry for my poor English ;)

Reference

Even though I could have disassembled dbghelp.dll (or msvcrt.dll), I didn't because of the legal issues. So I used only the UnDecorateSymbolName function to analyze this scheme.

I also learned the basic scheme from the following sources:

wine's __unDname function implementation (see /wine/dlls/msvcrt/undname.c)
http://www.kegel.com/mangle.html

Some features of this scheme depend on Microsoft's C++ extensions, such as Managed C++. Browse MSDN for more information.

Basic Structure

As you know, every mangled C++ name starts with ?. Because a mangled C name starts with an alphanumeric character, @ (at-sign) or _ (underscore), C++ names can be distinguished from C names.

The structure of a mangled name looks like this:

Prefix ?
Optional: Prefix @? [TODO: what does CV: mean?]
Qualified name
Type information (see below)

Function

Type information in function name generally looks like this:

Access level and function type
Conditional: CV-class modifier of function, if non-static member function
Function property

Data

Type information in data name looks like this:

Access level and storage class
Data type
CV-class modifier

Elements

A mangled name contains many elements that have to be discussed.

Name

A qualified name consists of the following fragments:

Basic name: either a name fragment or a special name
Qualification #1: one of a name fragment, a name with template arguments, a numbered namespace or a back reference
Qualification #2
...
Terminator @

Qualifications are written in reverse order. For example, myclass::nested::something becomes something@nested@myclass@@.
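The reversal can be sketched in a few lines of Python. This is only an illustration for plain names (no templates, special names, numbered namespaces or back references), and the function name mangle_qualified_name is made up for this example.

```python
def mangle_qualified_name(cpp_name: str) -> str:
    """Mangle a plain qualified name such as 'myclass::nested::something'.

    Hypothetical helper: handles plain name fragments only.
    """
    parts = cpp_name.split("::")
    # Innermost name first; each fragment gets a trailing '@',
    # and one more '@' terminates the whole qualified name.
    return "".join(part + "@" for part in reversed(parts)) + "@"


print(mangle_qualified_name("myclass::nested::something"))
# something@nested@myclass@@
```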

Name Fragment

A name fragment is simply the name followed by a trailing @.

Special Name

A special name is represented as a code preceded by ?. Most special names are constructors, destructors, operators and internal symbols. Below is a table of the known codes.

Code	Meaning with no `_`	Meaning with preceding `_`	Meaning with preceding two `_`s
`0`	Constructor	`operator/=`
`1`	Destructor	`operator%=`
`2`	`operator new`	`operator>>=`
`3`	`operator delete`	`operator<<=`
`4`	`operator=`	`operator&=`
`5`	`operator>>`	`operator\|=`
`6`	`operator<<`	`operator^=`
`7`	`operator!`	`vftable'
`8`	`operator==`	`vbtable'
`9`	`operator!=`	`vcall'
`A`	`operator[]`	`typeof'	`managed vector constructor iterator'
`B`	`operator returntype`^[1]	`local static guard'	`managed vector destructor iterator'
`C`	`operator->`	`string' (Unknown)^[2]	`eh vector copy constructor iterator'
`D`	`operator*`	`vbase destructor'	`eh vector vbase copy constructor iterator'
`E`	`operator++`	`vector deleting destructor'
`F`	`operator--`	`default constructor closure'
`G`	`operator-`	`scalar deleting destructor'
`H`	`operator+`	`vector constructor iterator'
`I`	`operator&`	`vector destructor iterator'
`J`	`operator->*`	`vector vbase constructor iterator'
`K`	`operator/`	`virtual displacement map'
`L`	`operator%`	`eh vector constructor iterator'
`M`	`operator<`	`eh vector destructor iterator'
`N`	`operator<=`	`eh vector vbase constructor iterator'
`O`	`operator>`	`copy constructor closure'
`P`	`operator>=`	`udt returning' (prefix)
`Q`	`operator,`	Unknown^[3]
`R`	`operator()`	RTTI-related code (see below)
`S`	`operator~`	`local vftable'
`T`	`operator^`	`local vftable constructor closure'
`U`	`operator\|`	`operator new[]`
`V`	`operator&&`	`operator delete[]`
`W`	`operator\|\|`
`X`	`operator*=`	`placement delete closure'
`Y`	`operator+=`	`placement delete[] closure'
`Z`	`operator-=`

The prefix _P is used as ?_PX, though I don't know what it means. [TODO: what is udt? user defined type?]

Below are the RTTI-related codes (all starting with _R). Some codes have trailing parameters.

Code	Meaning	Trailing Parameters
`_R0`	type `RTTI Type Descriptor'	Data type type.
`_R1`	`RTTI Base Class Descriptor at (a,b,c,d)'	Four encoded numbers a, b, c and d.
`_R2`	`RTTI Base Class Array'	None.
`_R3`	`RTTI Class Hierarchy Descriptor'	None.
`_R4`	`RTTI Complete Object Locator'	None.

Name with Template Arguments

A name fragment starting with ?$ has template arguments. This kind of name looks like this:

Prefix ?$
Name terminated by @
Template argument list

For example, assume the following prototype:

void __cdecl abc<def<int>,void*>::xyz(void);

The name of this function can be obtained by the following process:

abc<def<int>,void*>::xyz
xyz@ abc<def<int>,void*> @
xyz@ ?$abc@ def<int> void* @ @
xyz@ ?$abc@ V def<int> @ PAX @ @
xyz@ ?$abc@ V ?$def@H@ @ PAX @ @
xyz@?$abc@V?$def@H@@PAX@@

So the mangled name for this function is ?xyz@?$abc@V?$def@H@@PAX@@YAXXZ.
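The same steps can be replayed as plain string concatenation. The Python sketch below only reassembles the pieces already shown above; the helper template_name is hypothetical, and the type codes (H, V, PAX) and the trailing YAXXZ are taken as given from this example rather than computed.

```python
def template_name(name, args):
    """Build '?$' + name + '@' + template argument list + '@' (hypothetical helper)."""
    return "?$" + name + "@" + "".join(args) + "@"


inner = template_name("def", ["H"])               # def<int>        -> ?$def@H@
class_def = "V" + inner + "@"                     # class def<int>  -> V?$def@H@@
outer = template_name("abc", [class_def, "PAX"])  # abc<def<int>,void*>
qualified = "xyz@" + outer + "@"                  # basic name, qualification, terminator
mangled = "?" + qualified + "YAXXZ"               # prefix and type information

print(qualified)  # xyz@?$abc@V?$def@H@@PAX@@
print(mangled)    # ?xyz@?$abc@V?$def@H@@PAX@@YAXXZ
```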

Numbered Namespace

In a qualification, a numbered namespace is represented as ? followed by an unsigned number. The UnDecorateSymbolName function returns something like `42' for this kind of input.

As an exception, a numbered namespace starting with ?A becomes an anonymous namespace (`anonymous namespace').

Well, of course I'm not sure what it is. [TODO: what is exact meaning and name? I don't think its name is really "numbered namespace".]

Back Reference

Decimal digits 0 to 9 refer to the first through 10th name fragments shown. A referred name fragment can be a normal name fragment or a name fragment with template arguments. For example, in alpha@?1beta@@ (beta::`2'::alpha), 0 refers to alpha@ and 1 (not 2) refers to beta@.

Generally the back reference table is kept throughout the mangling process. This means a function argument (which appears after the function name) can use a back reference to the function name. However, a template argument list gets its own, separately created back reference table.

For example, assume ?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@ (std::basic_string<unsigned short, std::char_traits<unsigned short>, std::allocator<unsigned short> >). In std::basic_string<...>, 0 refers to basic_string@, 1 refers to ?$char_traits@G@, and 2 refers to std@. This relation doesn't change wherever the name appears.
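To make the table idea concrete, here is a small Python sketch that resolves back references in a qualified name made only of plain fragments and digits. It is a simplification under stated assumptions: templates, special names and numbered namespaces are not handled, parse_qualified_name is a made-up name, and the input x@a@b@1@ is a hypothetical name built to exercise the rule, not the output of a real compiler run.

```python
def parse_qualified_name(mangled: str):
    """Return (C++-style name, back reference table) for a plain qualified name."""
    table = []  # name fragments in order of first appearance (at most ten)
    parts = []
    pos = 0
    while mangled[pos] != "@":          # a lone '@' terminates the qualified name
        ch = mangled[pos]
        if ch in "0123456789":
            parts.append(table[int(ch)])  # back reference: no trailing '@'
            pos += 1
        else:
            end = mangled.index("@", pos)
            name = mangled[pos:end]
            if len(table) < 10:
                table.append(name)
            parts.append(name)
            pos = end + 1
    # Qualification is stored innermost first, so reverse it for display.
    return "::".join(reversed(parts)), table


print(parse_qualified_name("x@a@b@1@"))
# ('a::b::a::x', ['x', 'a', 'b'])
```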

Encoded Number

Name mangling sometimes needs to represent a number (e.g. array indices). The rules are simple:

0 to 9 represent the numbers 1 to 10.
num@ represents a hexadecimal number, where num consists of hexadecimal digits A (meaning 0) to P (meaning 15). For example, BCD@ means 0x123, that is, 291.
@ represents the number 0.
Where allowed, the prefix ? represents a minus sign. Note that both ?@ and @ represent the number 0.
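These four rules translate almost directly into code. The following Python sketch is one possible decoder; the name decode_number and the (value, next position) return convention are choices made for this example only.

```python
def decode_number(s: str, pos: int = 0, allow_negative: bool = True):
    """Decode one encoded number starting at s[pos]; return (value, next_pos)."""
    negative = False
    if allow_negative and s[pos] == "?":   # optional minus sign
        negative = True
        pos += 1
    ch = s[pos]
    if ch in "0123456789":
        value = int(ch) + 1                # '0'..'9' mean 1..10
        pos += 1
    else:
        end = s.index("@", pos)            # digits A..P up to the '@'
        value = 0
        for c in s[pos:end]:
            if not "A" <= c <= "P":
                raise ValueError("bad hexadecimal digit: " + c)
            value = value * 16 + (ord(c) - ord("A"))
        pos = end + 1                      # a bare '@' leaves value at 0
    return (-value if negative else value), pos


print(decode_number("BCD@"))   # (291, 4)
print(decode_number("?2"))     # (-3, 2)
print(decode_number("?@"))     # (0, 2)
```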

Data Type

The table below shows the various data types and modifiers.

Code	Meaning with no `_`	Meaning with preceding `_`
?	Type modifier, Template parameter
$	Type modifier, Template parameter^[4]	__w64 (prefix)
0-9	Back reference
A	Type modifier (reference)
B	Type modifier (volatile reference)
C	signed char
D	char	__int8
E	unsigned char	unsigned __int8
F	short	__int16
G	unsigned short	unsigned __int16
H	int	__int32
I	unsigned int	unsigned __int32
J	long	__int64
K	unsigned long	unsigned __int64
L		__int128
M	float	unsigned __int128
N	double	bool
O	long double	Array
P	Type modifier (pointer)
Q	Type modifier (const pointer)
R	Type modifier (volatile pointer)
S	Type modifier (const volatile pointer)
T	Complex Type (union)
U	Complex Type (struct)
V	Complex Type (class)
W	Enumerate Type (enum)	wchar_t
X	void, Complex Type (coclass)	Complex Type (coclass)
Y	Complex Type (cointerface)	Complex Type (cointerface)
Z	... (ellipsis)

Actually, void for X and ellipsis for Z can be used only for the terminator of an argument list or for a pointer. Otherwise, X is used as coclass.

Primitive & Extended Type

Primitive types are represented as one character, and extended types are represented as one character preceded by _.

Back Reference

Decimal digits 0 to 9 refer to the first through 10th types shown in the argument list. (This means the return type cannot be referred to.) A back reference can refer to any non-primitive type, including extended types. Of course a back reference can refer to a prefixed type such as PAVblah@@ (class blah *), but it cannot refer to the prefixless type inside it (say, Vblah@@ in PAVblah@@).

As with back references for names, a separate back reference table is created inside a template argument list. A function argument list has no such scoping rule, which can be confusing. For example, assume P6AXValpha@@Vbeta@@@Z (void (__cdecl*)(class alpha, class beta)) is the first non-primitive type shown. Then 0 refers to Valpha@@, 1 refers to Vbeta@@, and finally 2 refers to the function pointer.

Type Modifier

A type modifier is used to make a pointer or a reference. A type modifier looks like this:

Modifier type
Optional: Managed C++ property ($A for __gc, $B for __pin)
CV-class modifier
Optional: Array property (not for function)
- Prefix Y
- Encoded unsigned number of dimension
- Array indices as encoded unsigned number, dimension times
Referred type info (see below)

There are eight kinds of type modifier:

	none	const	volatile	const volatile
Pointer	`P`	`Q`	`R`	`S`
Reference	`A`		`B`
none	`?`^[5], `$$C`

For normal type, referred type info is data type. For function, it looks like the following. (It depends on CV-class modifier)

Conditional: CV-class modifier, if member function
Function property

Complex Type (union, struct, class, coclass, cointerface)

Complex type looks like this:

Kind of complex type (T, U, V, ...)^[6]
Qualification without basic name

Enumerate Type (enum)

Enumerate type starts with prefix W. It looks like this:

Prefix W
Real type for enum
Qualification without basic name

The real type for an enum is represented as follows:

Code	Corresponding Real Type
`0`	char
`1`	unsigned char
`2`	short
`3`	unsigned short
`4`	int (generally normal "enum")
`5`	unsigned int
`6`	long
`7`	unsigned long

Array

Array (not pointer to array!) starts with prefix _O. It looks like this:

Prefix _O
CV-class modifier
Data type within array

You can write a multi-dimensional array like _OC_OBH, but only the outermost CV-class modifier takes effect. (In this case _OC_OBH means int volatile [][], not int const [][].)

Template Parameter

A template parameter is used to represent a type or non-type template argument. It can be used only in a template argument list.

The table below is a list of known template parameters. a, b, c represent encoded signed numbers, and x, y, z represent encoded unsigned numbers.

Code	Meaning
`?x`	anonymous type template parameter x (`template-parameter-x')
`$0a`	integer value a
`$2ab`	real value a × 10^(b-k+1), where k is the number of decimal digits of a^[7]
`$Da`	anonymous type template parameter a (`template-parametera')
`$Fab`	2-tuple {a,b} (unknown)
`$Gabc`	3-tuple {a,b,c} (unknown)
`$Hx`	(unknown)
`$Ixy`	(unknown)
`$Jxyz`	(unknown)
`$Qa`	anonymous non-type template parameter a (`non-type-template-parametera')
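The formula for `$2ab` is easy to get wrong by one, so here is a short Python check of it against the example in footnote [7]. It assumes a and b have already been decoded from their encoded form; real_template_value is a name invented for this sketch, and Decimal is used only to avoid binary floating point noise.

```python
from decimal import Decimal


def real_template_value(a: int, b: int) -> Decimal:
    """Value of a `$2ab` template parameter from its two decoded numbers."""
    k = len(str(abs(a)))  # number of decimal digits of a
    return Decimal(a) * Decimal(10) ** (b - k + 1)


# Footnote [7]: HKLH@ decodes to 31415 and ?2 decodes to -3.
print(real_template_value(31415, -3))  # 0.0031415
```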

Argument List

Argument list is a sequence of data types. List can be one of the following:

X (means void, also terminating list)
arg1 arg2 ... argN @ (means normal list of data types. Note that N can be zero)
arg1 arg2 ... argN Z (means list with trailing ellipsis)
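The three forms can be told apart with a very small parser. The Python sketch below handles one-character primitive types only (a subset of the data type table); pointers, complex types and back references are left out, and the names PRIMITIVES and parse_argument_list are made up for this example.

```python
# One-character primitive types, copied from the data type table.
PRIMITIVES = {
    "C": "signed char", "D": "char", "E": "unsigned char",
    "F": "short", "G": "unsigned short", "H": "int", "I": "unsigned int",
    "J": "long", "K": "unsigned long",
    "M": "float", "N": "double", "O": "long double",
}


def parse_argument_list(s: str, pos: int = 0):
    """Parse an argument list of primitive types; return (types, next_pos)."""
    if s[pos] == "X":                 # X: void, which also ends the list
        return ["void"], pos + 1
    args = []
    while s[pos] not in "@Z":         # '@' ends a normal list, 'Z' adds an ellipsis
        args.append(PRIMITIVES[s[pos]])
        pos += 1
    if s[pos] == "Z":
        args.append("...")
    return args, pos + 1


print(parse_argument_list("X"))    # (['void'], 1)
print(parse_argument_list("HN@"))  # (['int', 'double'], 3)
print(parse_argument_list("HZ"))   # (['int', '...'], 2)
```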

Template Argument List

A template argument list is the same as an argument list, except that template parameters can be used.

CV-class Modifier

The following table shows CV-class modifiers.^*

	Variable				Function
	none	const	volatile	const volatile	Function
none	`A`	`B`, `J`	`C`, `G`, `K`	`D`, `H`, `L`	`6`, `7`
__based()	`M`	`N`	`O`	`P`	`_A`, `_B`
Member	`Q`, `U`, `Y`	`R`, `V`, `Z`	`S`, `W`, `0`	`T`, `X`, `1`	`8`, `9`
__based() Member	`2`	`3`	`4`	`5`	`_C`, `_D`

A CV-class modifier can have zero or more prefixes:

Prefix	Meaning
`E`	type __ptr64
`F`	__unaligned type
`I`	type __restrict

Modifiers have trailing parameters as follows:

Conditional: Qualification without basic name, if member
Conditional: CV-class modifier of function, if member function
Conditional: __based() property, if used

A CV-class modifier is usually used in a reference or pointer type, but it is also used in other places, with some restrictions:

Modifier of function: can only have const, volatile attribute, optionally with prefixes.
Modifier of data: cannot have function property.

__based() Property

__based() property represents Microsoft's __based() attribute extension to C++. This property can be one of the following:

0 (means __based(void))
2name (means __based(name), where name is qualification without basic name)
5 (means no __based())

Function Property

Function property represents prototype of function. It looks like this:

Calling convention of function
Data type of returned value, or @ for void
Argument list
throw() attribute

The following table shows calling convention of function:

Code	Exported?	Calling Convention
`A`	No	__cdecl
`B`	Yes	__cdecl
`C`	No	__pascal
`D`	Yes	__pascal
`E`	No	__thiscall
`F`	Yes	__thiscall
`G`	No	__stdcall
`H`	Yes	__stdcall
`I`	No	__fastcall
`J`	Yes	__fastcall
`K`	No	none
`L`	Yes	none
`M`	No	__clrcall

The argument list for the throw() attribute is the same as an ordinary argument list, but if this list is Z, there is no throw() attribute. If you want to use throw() you have to use @ instead.
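As an illustration of how the four parts fit together, the Python sketch below decodes a function property such as the AXXZ tail of ?xyz@...@@YAXXZ. It is deliberately limited: the type table is a tiny subset, only one-character types are understood, and every name in it (CALLING_CONVENTIONS, TYPES, parse_list, describe_function_property) is hypothetical.

```python
# Calling convention codes from the table above: code -> (name, exported?)
CALLING_CONVENTIONS = {
    "A": ("__cdecl", False), "B": ("__cdecl", True),
    "C": ("__pascal", False), "D": ("__pascal", True),
    "E": ("__thiscall", False), "F": ("__thiscall", True),
    "G": ("__stdcall", False), "H": ("__stdcall", True),
    "I": ("__fastcall", False), "J": ("__fastcall", True),
    "K": ("", False), "L": ("", True),
    "M": ("__clrcall", False),
}
TYPES = {"X": "void", "D": "char", "H": "int", "N": "double"}  # small subset


def parse_list(s, pos):
    """Parse an argument list (or throw() list) of one-character types."""
    if s[pos] == "X":
        return ["void"], pos + 1
    items = []
    while s[pos] not in "@Z":
        items.append(TYPES[s[pos]])
        pos += 1
    if s[pos] == "Z":
        items.append("...")
    return items, pos + 1


def describe_function_property(prop):
    conv, _exported = CALLING_CONVENTIONS[prop[0]]
    ret = TYPES[prop[1]]
    args, pos = parse_list(prop, 2)
    throws, pos = parse_list(prop, pos)
    text = " ".join(w for w in (ret, conv) if w) + "(" + ", ".join(args) + ")"
    if throws != ["..."]:             # a bare Z means "no throw() attribute"
        text += " throw(" + ", ".join(throws) + ")"
    return text


print(describe_function_property("AXXZ"))    # void __cdecl(void)
print(describe_function_property("GHHN@@"))  # int __stdcall(int, double) throw()
```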

Function

Typical type information in function name looks like this:

Optional: Prefix _ (means __based() property is used)
Access level and function type
Conditional: __based() property, if used
Conditional: adjustor property (as encoded unsigned number), if thunk function
Conditional: CV-class modifier of function, if non-static member function
Function property

The table below shows code for access level and function type:

	none	static	virtual	thunk
private:	`A`, `B`	`C`, `D`	`E`, `F`	`G`, `H`
protected:	`I`, `J`	`K`, `L`	`M`, `N`	`O`, `P`
public:	`Q`, `R`	`S`, `T`	`U`, `V`	`W`, `X`
none	`Y`, `Z`

This kind of thunk function is always virtual, and is used to represent the logical this adjustor property, that is, an offset to the true this value under some forms of multiple inheritance.
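Because the table is laid out regularly (eight letters per access level, two letters per function type), the code can be decoded with simple arithmetic. The Python sketch below does this; decode_access_and_kind is a made-up name, and the two codes in each cell are treated as equivalent because this document does not say how they differ.

```python
ACCESS = ["private", "protected", "public"]
KINDS = ["none", "static", "virtual", "thunk"]


def decode_access_and_kind(code: str):
    """Map a one-letter code to (access level, function type)."""
    if code in ("Y", "Z"):                # last row: no access level
        return None, "none"
    index = ord(code) - ord("A")
    if len(code) != 1 or not 0 <= index < 24:
        raise ValueError("unknown code: " + code)
    # 8 letters per access level, 2 letters per function type.
    return ACCESS[index // 8], KINDS[(index % 8) // 2]


print(decode_access_and_kind("Q"))  # ('public', 'none')
print(decode_access_and_kind("E"))  # ('private', 'virtual')
print(decode_access_and_kind("Y"))  # (None, 'none')
```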

Data

Type information in data name looks like this:

Access level and storage class
Data type
CV-class modifier

The table below shows code for access level and storage class:

Code	Meaning
`0`	Private static member
`1`	Protected static member
`2`	Public static member
`3`	Normal variable
`4`	Normal variable

The CV-class modifier here must not be a function modifier.
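Putting the pieces together, a very simple data name can be decoded as follows. This Python sketch is an assumption-laden illustration: it accepts only plain name fragments, one-character primitive types and the four non-member CV-class codes A to D, the function name decode_simple_data is invented, and the second input is a hypothetical name written by hand for this example.

```python
STORAGE = {
    "0": "private: static", "1": "protected: static", "2": "public: static",
    "3": "", "4": "",                  # normal variables carry no access label
}
TYPES = {"D": "char", "H": "int", "J": "long", "N": "double"}  # small subset
CV = {"A": "", "B": "const", "C": "volatile", "D": "const volatile"}


def decode_simple_data(mangled: str) -> str:
    if mangled[0] != "?":
        raise ValueError("not a mangled C++ name")
    pos = 1
    parts = []
    while mangled[pos] != "@":         # plain fragments up to the terminator
        end = mangled.index("@", pos)
        parts.append(mangled[pos:end])
        pos = end + 1
    pos += 1                           # skip the terminator
    storage = STORAGE[mangled[pos]]    # access level and storage class
    dtype = TYPES[mangled[pos + 1]]    # data type
    cv = CV[mangled[pos + 2]]          # CV-class modifier
    name = "::".join(reversed(parts))
    return " ".join(w for w in (storage, dtype, cv, name) if w)


print(decode_simple_data("?x@@3HA"))              # int x
print(decode_simple_data("?count@myclass@@2HB"))  # public: static int const myclass::count
```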

Thunk Function

There are several kinds of thunk functions. [TODO: a lot of thunk function!]

Footnotes

* Some tables contain two or more entries in one cell. In such cases, I tried to place the more frequently used entry first. (But I'm not sure about this ordering. Don't ask me about it!)

[1] Its meaning depends on return type of function. For instance, if this function returns int type then its name will be operator int.

[2] It seems that the structure after ?_C is different from the other structures. I think it can be described by the regular expression \?_C@_[0-9A-P]([0-9A-P][A-P]*)?@.*@, but I'm not sure.

[3] It can be EH-related code, but UnDecorateSymbolName function cannot demangle this.

[4] There is $$B prefix, but it seems that this prefix can be ignored.

[5] ? is valid only for the type of data. Also, ? should be the outermost type modifier. (?CPB is valid but PB?C is not.)

[6] ? and L can be complex type without any tag such as class, but it can also be a bug of the function.

[7] For example, $2HKLH@?2 means 3.1415 × 10^-3 = 0.0031415, because HKLH@ means 31415 and ?2 means -3.