DWARF tree for fully-qualified name construction

The Windows debuggers expect PDB symbol names to be fully qualified. That is, if a class Foo has a constructor, its name should be emitted as `Foo::Foo`, not simply `Foo` as is the case today.

Linux debuggers like GDB dynamically reconstruct the symbol tree at runtime each time a program is debugged. Windows debuggers, on the other hand, do not, and expect the name to be fully qualified from the outset.

Failing this, the constructor function `Foo` would have the same name as the class `Foo` in the PDB, and WinDbg will get confused about what to dump (e.g. using `dt Foo`) and arbitrarily pick the largest item, which might be the constructor. You therefore end up dumping the wrong thing and are completely unable to inspect the contents of a `Foo` object.

This commit aims to fix that by introducing a DWARF tree during the conversion process, which allows us to efficiently reconstruct such fully qualified names during the conversion.

A note about DWARF: the DWARF format does not explicitly record the parent of any given DIE record. It is instead implicit in how the records are laid out. Any record may have a "has children" flag, and if it does, then the records following it are its children, terminated by a special NULL record, which pops back up one level of the tree. The DIECursor already recognized this structure but did not capture it in memory for later use.

In order to construct fully qualified names for functions, enums, classes, etc. (i.e. taking into account namespaces, nesting, etc.), we need a way to efficiently look up a node's parent. Thus the DWARF tree was born.

At a high level, we take advantage of the fact that the DWARF sections were already scanned in two passes. We hook into the first pass (where the typeIDs were being reserved) and build the DWARF tree. Then, in the second pass (where the CV symbols get emitted), we walk up the tree to figure out the correct fully qualified symbol names.

NOTE: The first phase of this work focuses on subroutines only. Later work will enable support for structs/classes/enums.

On the subroutine front, I also added a flag to capture whether a DIE is a "declaration" or a definition (based on the DW_AT_declaration attribute). This is needed to consolidate a function's declaration and definition into one PDB symbol, as otherwise WinDbg will get confused. This also matches what the MSVC toolset produces.

A few other related additions:
- Added a helper to format a fully qualified function name by walking up the tree added in this commit.
- Added a helper to print the DWARF tree for debugging purposes, and a flag to control it.
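
The two-pass idea described in this message can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: `Node`, `die`, `nullDie`, `linkParents`, `qualifiedName` and `sampleDies` are hypothetical names invented for illustration, and the flat record list is a simplified stand-in for what DIECursor walks. The sketch assumes only what is stated above: a record may carry a "has children" flag, and a NULL record closes the current list of children.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a DIE record (not cv2pdb's DWARF_InfoData).
struct Node {
    std::string name;          // empty for unnamed records
    bool hasChildren = false;  // the "has children" flag
    bool isNull = false;       // the special NULL record that ends a child list
    int parent = -1;           // index of the parent record, -1 at the top level
};

// Small factories so the sample stream below stays readable.
Node die(const std::string& name, bool hasChildren) {
    Node n;
    n.name = name;
    n.hasChildren = hasChildren;
    return n;
}

Node nullDie() {
    Node n;
    n.isNull = true;
    return n;
}

// "First pass": recover each record's parent from the flat record order.
void linkParents(std::vector<Node>& dies) {
    std::vector<int> open;  // indices of records whose child list is still open
    for (int i = 0; i < static_cast<int>(dies.size()); ++i) {
        if (dies[i].isNull) {
            if (!open.empty()) open.pop_back();  // pop back up one level
            continue;
        }
        dies[i].parent = open.empty() ? -1 : open.back();
        if (dies[i].hasChildren) open.push_back(i);
    }
}

// "Second pass": walk up the parent links and join the names with "::".
// Unnamed ancestors are skipped here; a real converter would more likely
// decide which ancestors contribute a name by looking at their DWARF tag.
std::string qualifiedName(const std::vector<Node>& dies, int index) {
    std::string out = dies[index].name;
    for (int p = dies[index].parent; p >= 0; p = dies[p].parent) {
        if (!dies[p].name.empty()) out = dies[p].name + "::" + out;
    }
    return out;
}

// Sample stream: unit { namespace ns { class Foo { ctor Foo } } main }
std::vector<Node> sampleDies() {
    std::vector<Node> dies;
    dies.push_back(die("", true));      // 0: unnamed top-level unit
    dies.push_back(die("ns", true));    // 1: namespace ns
    dies.push_back(die("Foo", true));   // 2: class Foo
    dies.push_back(die("Foo", false));  // 3: constructor Foo
    dies.push_back(nullDie());          // 4: ends the children of class Foo
    dies.push_back(nullDie());          // 5: ends the children of ns
    dies.push_back(die("main", false)); // 6: free function main
    dies.push_back(nullDie());          // 7: ends the children of the unit
    return dies;
}
```

After calling `linkParents` on the sample stream, `qualifiedName(dies, 3)` yields `ns::Foo::Foo` for the constructor, while `qualifiedName(dies, 6)` yields plain `main`. Storing a parent link per record is what makes the second-pass lookup cheap: each name costs one walk up the tree rather than a rescan of the section.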
author: Alex Budovski <alexbud@meta.com> 2023-03-23 01:37:01 (GMT)
committer: Alex Budovski <alexbud@meta.com> 2023-03-24 15:12:48 (GMT)
commit: 62f975d2b4030d10a50e140f44f39ede418bcec4 (patch)
tree: 3c87638dd38e81bdd851e7257353d36ab6cc188c /src/cv2pdb.h
parent: 2e4c1bf97b1491385c37432aef58b15943eb118a (diff)
download: cv2pdb-62f975d2b4030d10a50e140f44f39ede418bcec4.zip
cv2pdb-62f975d2b4030d10a50e140f44f39ede418bcec4.tar.gz
cv2pdb-62f975d2b4030d10a50e140f44f39ede418bcec4.tar.bz2
1 files changed, 17 insertions, 4 deletions
diff --git a/src/cv2pdb.h b/src/cv2pdb.h
index 654470b..e5e8144 100644
--- a/src/cv2pdb.h
+++ b/src/cv2pdb.h
@@ -169,17 +169,23 @@ public:
 	bool addDWARFLines();
 	bool addDWARFPublics();
 	bool writeDWARFImage(const TCHAR* opath);
+	DWARF_InfoData* findEntryByPtr(byte* entryPtr) const;
+
+	// Helper to just print the DWARF tree we've built for debugging purposes.
+	void dumpDwarfTree() const;
 
 	bool addDWARFSectionContrib(mspdb::Mod* mod, unsigned long pclo, unsigned long pchi);
 	bool addDWARFProc(DWARF_InfoData& id, const std::vector<RangeEntry> &ranges, DIECursor cursor);
+	void formatFullyQualifiedProcName(const DWARF_InfoData* proc, char* buf, size_t cbBuf) const;
+
 	int  addDWARFStructure(DWARF_InfoData& id, DIECursor cursor);
-	int  addDWARFFields(DWARF_InfoData& structid, DIECursor cursor, int off, int flStart);
-	int  addDWARFArray(DWARF_InfoData& arrayid, DIECursor cursor);
+	int  addDWARFFields(DWARF_InfoData& structid, DIECursor& cursor, int off, int flStart);
+	int  addDWARFArray(DWARF_InfoData& arrayid, const DIECursor& cursor);
 	int  addDWARFBasicType(const char*name, int encoding, int byte_size);
 	int  addDWARFEnum(DWARF_InfoData& enumid, DIECursor cursor);
 	int  getTypeByDWARFPtr(byte* ptr);
 	int  getDWARFTypeSize(const DIECursor& parent, byte* ptr);
-	void getDWARFArrayBounds(DWARF_InfoData& arrayid, DIECursor cursor,
+	void getDWARFArrayBounds(DIECursor cursor,
 		int& basetype, int& lowerBound, int& upperBound);
 	void getDWARFSubrangeInfo(DWARF_InfoData& subrangeid, const DIECursor& parent,
 		int& basetype, int& lowerBound, int& upperBound);
@@ -278,7 +284,14 @@ public:
 
 	// DWARF
 	int codeSegOff;
-	std::unordered_map<byte*, int> mapOffsetToType;
+
+	// Lookup table for type IDs based on the DWARF_InfoData::entryPtr
+	std::unordered_map<byte*, int> mapEntryPtrToTypeID;
+	// Lookup table for entries based on the DWARF_InfoData::entryPtr
+	std::unordered_map<byte*, DWARF_InfoData*> mapEntryPtrToEntry;
+
+	// Head of list of DWARF DIE nodes.
+	DWARF_InfoData* dwarfHead = nullptr;
 
 	// Default lower bound for the current compilation unit. This depends on
 	// the language of the current unit.