How is sizeof implemented in C++?

Answer from user lan-se-52-30:

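A sizeof expression is replaced directly by the compiler, so even the assembly output shows only a constant. That is why reading the disassembly, as suggested in another answer below, will not work: the substitution has already happened inside the compiler. (Strictly speaking, VLAs are a special case, as the code walkthrough below mentions.) The rest of this answer follows how Clang handles sizeof.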
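
The compile-time nature of sizeof can be checked without looking at Clang at all. The sketch below is only an illustration: `neverCalled`, `Buffer`, and `SizeTag` are hypothetical names invented for it. It shows that the operand is never evaluated and that the result can be used wherever a constant expression is required.

```cpp
#include <cstddef>

// Hypothetical function: declared but never defined. sizeof only needs the
// return type, so no call is generated and no definition is required.
long double neverCalled();

// The result is a constant expression, so it can serve as an array bound...
using Buffer = char[sizeof(long double)];

// ...or as a non-type template argument.
template <std::size_t N>
struct SizeTag {
    static constexpr std::size_t value = N;
};

static_assert(sizeof(neverCalled()) == sizeof(long double),
              "the operand of sizeof is not evaluated");
static_assert(sizeof(Buffer) == sizeof(long double),
              "sizeof is usable as an array bound");
static_assert(SizeTag<sizeof(int)>::value == sizeof(int),
              "sizeof is usable as a template argument");
```

If any of these assertions were false, the translation unit would fail to compile, which is the point: the values are fixed long before any assembly exists.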

In Clang's implementation, lib/AST/ExprConstant.cpp contains this method:

       bool IntExprEvaluator::VisitUnaryExprOrTypeTraitExpr

Its body looks like this:

      switch(E->getKind()) {
      case UETT_AlignOf: {
        if (E->isArgumentType())
          return Success(GetAlignOfType(Info, E->getArgumentType()), E);
        else
          return Success(GetAlignOfExpr(Info, E->getArgumentExpr()), E);
      }

      case UETT_VecStep: {
        QualType Ty = E->getTypeOfArgument();

        if (Ty->isVectorType()) {
          unsigned n = Ty->castAs<VectorType>()->getNumElements();

          // The vec_step built-in functions that take a 3-component
          // vector return 4. (OpenCL 1.1 spec 6.11.12)
          if (n == 3)
            n = 4;

          return Success(n, E);
        } else
          return Success(1, E);
      }

      case UETT_SizeOf: {
        QualType SrcTy = E->getTypeOfArgument();
        // C++ [expr.sizeof]p2: "When applied to a reference or a reference type,
        //   the result is the size of the referenced type."
        if (const ReferenceType *Ref = SrcTy->getAs<ReferenceType>())
          SrcTy = Ref->getPointeeType();

        CharUnits Sizeof;
        if (!HandleSizeof(Info, E->getExprLoc(), SrcTy, Sizeof))
          return false;
        return Success(Sizeof, E);
      }
      }

      llvm_unreachable("unknown expr/type trait");
    }
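
The UETT_SizeOf branch strips a reference before computing the size, following the quoted [expr.sizeof]p2 rule. A small sketch of the visible effect, where `Packet` and `sizeOfReferenced` are hypothetical names used only for this illustration:

```cpp
#include <cstddef>

// Hypothetical type used only for this illustration.
struct Packet {
    char payload[64];
    int length;
};

// Hypothetical helper: asks for the size of a reference type.
template <typename T>
constexpr std::size_t sizeOfReferenced() {
    return sizeof(T&);
}

// sizeof on a reference type reports the size of the referenced type,
// not the size of whatever the compiler uses to implement the reference.
static_assert(sizeof(Packet&) == sizeof(Packet), "lvalue reference");
static_assert(sizeof(Packet&&) == sizeof(Packet), "rvalue reference");
static_assert(sizeof(char&) == 1, "a reference to char has size 1");
static_assert(sizeOfReferenced<double>() == sizeof(double), "via template");
```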

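Following this method, we find that sizeof is actually handled in HandleSizeof, and that the result is stored in Sizeof, which is a CharUnits. CharUnits is one of Clang's internal representations; quoting Clang's own comment: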

    /// CharUnits - This is an opaque type for sizes expressed in character units.
    /// Instances of this type represent a quantity as a multiple of the size
    /// of the standard C type, char, on the target architecture. As an opaque
    /// type, CharUnits protects you from accidentally combining operations on
    /// quantities in bit units and character units.
    ///
    /// In both C and C++, an object of type 'char', 'signed char', or 'unsigned
    /// char' occupies exactly one byte, so 'character unit' and 'byte' refer to
    /// the same quantity of storage. However, we use the term 'character unit'
    /// rather than 'byte' to avoid an implication that a character unit is
    /// exactly 8 bits.
    ///
    /// For portability, never assume that a target character is 8 bits wide. Use
    /// CharUnit values wherever you calculate sizes, offsets, or alignments
    /// in character units.
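
To make the idea of an opaque character-unit type concrete, here is a minimal sketch. `CharCount` is a hypothetical stand-in written for this answer, not Clang's actual CharUnits class, and it assumes the host's CHAR_BIT as the default character width.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical stand-in for the idea behind clang::CharUnits.
// A quantity can only be created and read through named functions, so a
// value in bits cannot be mixed with a value in chars by accident.
class CharCount {
public:
    static constexpr CharCount fromQuantity(std::int64_t q) {
        return CharCount(q);
    }
    static constexpr CharCount one() { return CharCount(1); }

    constexpr std::int64_t quantity() const { return quantity_; }

    // Converting to bits is explicit and goes through the char width.
    constexpr std::int64_t toBits(unsigned charWidth = CHAR_BIT) const {
        return quantity_ * static_cast<std::int64_t>(charWidth);
    }

    friend constexpr CharCount operator+(CharCount a, CharCount b) {
        return CharCount(a.quantity_ + b.quantity_);
    }

private:
    explicit constexpr CharCount(std::int64_t q) : quantity_(q) {}
    std::int64_t quantity_;
};

static_assert(sizeof(char) == 1, "sizeof counts in units of char");
static_assert(CharCount::one().toBits() == CHAR_BIT,
              "one char unit is CHAR_BIT bits, which need not be 8");
static_assert((CharCount::fromQuantity(2) + CharCount::one()).quantity() == 3,
              "arithmetic stays in char units");
```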

Next, we look up HandleSizeof:

    /// Get the size of the given type in char units.
    static bool HandleSizeof(EvalInfo &Info, SourceLocation Loc,
                             QualType Type, CharUnits &Size) {
      // sizeof(void), __alignof__(void), sizeof(function) = 1 as a gcc
      // extension.
      if (Type->isVoidType() || Type->isFunctionType()) {
        Size = CharUnits::One();
        return true;
      }

      if (!Type->isConstantSizeType()) {
        // sizeof(vla) is not a constantexpr: C99 6.5.3.4p2.
        // FIXME: Better diagnostic.
        Info.Diag(Loc);
        return false;
      }

      Size = Info.Ctx.getTypeSizeInChars(Type);
      return true;
    }

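At this point we can see why the expression gets replaced. If the operand is void or a function type, for example, the compiler substitutes the constant CharUnits::One() (the size of one char). This is also why the assembly shows only a constant: assembly is produced later, by CodeGen, while this evaluation happens before CodeGen. The function also checks whether the type is a constant-size type, because the size has to be computable at compile time. The comment on that branch refers to VLAs; interested readers can look up the C99 clause it cites (6.5.3.4p2). In the remaining case the type is passed on to getTypeSizeInChars.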
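
The three branches of HandleSizeof can be modelled in a few lines. This is a toy model only: `Kind`, `ToyType`, `toyHandleSizeof`, and `toySizeOrMinusOne` are hypothetical names, and Clang's real QualType carries far more information than a tag and a number.

```cpp
#include <cstdint>

// Hypothetical, heavily simplified type description.
enum class Kind { Void, Function, VariableArray, Sized };

struct ToyType {
    Kind kind;
    std::int64_t sizeInChars;  // meaningful only when kind == Kind::Sized
};

// Mirrors the control flow of HandleSizeof: returns false when the size is
// not a compile-time constant, otherwise writes the size in char units.
constexpr bool toyHandleSizeof(ToyType type, std::int64_t& size) {
    if (type.kind == Kind::Void || type.kind == Kind::Function) {
        size = 1;  // GCC extension path: one char unit
        return true;
    }
    if (type.kind == Kind::VariableArray) {
        return false;  // a VLA has no constant size
    }
    size = type.sizeInChars;  // stands in for getTypeSizeInChars
    return true;
}

// Convenience wrapper for this sketch: -1 signals "not a constant".
constexpr std::int64_t toySizeOrMinusOne(ToyType type) {
    std::int64_t size = 0;
    return toyHandleSizeof(type, size) ? size : -1;
}

static_assert(toySizeOrMinusOne(ToyType{Kind::Void, 0}) == 1, "void");
static_assert(toySizeOrMinusOne(ToyType{Kind::VariableArray, 0}) == -1, "VLA");
static_assert(toySizeOrMinusOne(ToyType{Kind::Sized, 24}) == 24, "sized type");
```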

OK, let's keep going step by step and see what getTypeSizeInChars does.

    /// getTypeSizeInChars - Return the size of the specified type, in characters.
    /// This method does not work on incomplete types.
    CharUnits ASTContext::getTypeSizeInChars(QualType T) const {
      return getTypeInfoInChars(T).first;
    }

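Even without going further we can already tell that this method returns the size of the given type, but let's get to the bottom of it and see how it is actually implemented. So we continue into getTypeInfoInChars().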

    std::pair<CharUnits, CharUnits>
    ASTContext::getTypeInfoInChars(QualType T) const {
      return getTypeInfoInChars(T.getTypePtr());
    }

Now we know where the .first comes from: this method returns a std::pair. It in turn calls getTypeInfoInChars again, this time with a const Type * argument, so we look for that overload:
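
The pair is (size, alignment), and the size query simply takes the first element. The same shape can be sketched in standard C++; `typeInfoInChars`, `typeSizeInChars`, and `Sample` are hypothetical names for this illustration, built on the language's own sizeof and alignof rather than on Clang's ASTContext.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical analogue of getTypeInfoInChars: (size, alignment) in chars.
template <typename T>
constexpr std::pair<std::size_t, std::size_t> typeInfoInChars() {
    return std::pair<std::size_t, std::size_t>(sizeof(T), alignof(T));
}

// Hypothetical analogue of getTypeSizeInChars: keep only the size.
template <typename T>
constexpr std::size_t typeSizeInChars() {
    return typeInfoInChars<T>().first;
}

// Hypothetical type used only for this illustration.
struct Sample {
    char tag;
    double value;
};

static_assert(typeSizeInChars<Sample>() == sizeof(Sample), "first is the size");
static_assert(typeInfoInChars<Sample>().second == alignof(Sample),
              "second is the alignment");
```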

    std::pair<CharUnits, CharUnits>
    ASTContext::getTypeInfoInChars(const Type *T) const {
      if (const ConstantArrayType *CAT = dyn_cast<ConstantArrayType>(T))
        return getConstantArrayInfoInChars(*this, CAT);
      TypeInfo Info = getTypeInfo(T);
      return std::make_pair(toCharUnitsFromBits(Info.Width),
                            toCharUnitsFromBits(Info.Align));
    }
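
Note the unit change here: getTypeInfo reports Width and Align in bits, and the result is converted to character units on the way out. A minimal sketch of that conversion, where `toCharsFromBits` is a hypothetical helper that assumes the host's CHAR_BIT as the target character width unless told otherwise:

```cpp
#include <climits>
#include <cstdint>

// Hypothetical counterpart of the bits-to-char-units conversion.
// Assumption: the character width defaults to the host's CHAR_BIT.
constexpr std::int64_t toCharsFromBits(std::int64_t bits,
                                       unsigned charWidth = CHAR_BIT) {
    return bits / static_cast<std::int64_t>(charWidth);
}

// A 32-bit wide type on a target with 8-bit chars occupies 4 char units.
static_assert(toCharsFromBits(32, 8) == 4, "32 bits / 8 bits per char");

// Round trip on the host: sizeof(int) chars -> bits -> chars.
static_assert(toCharsFromBits(static_cast<std::int64_t>(sizeof(int)) * CHAR_BIT) ==
                  static_cast<std::int64_t>(sizeof(int)),
              "bits and char units agree for int");
```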

This leads to getTypeInfo, whose code is:

    TypeInfo ASTContext::getTypeInfo(const Type *T) const {
      TypeInfoMap::iterator I = MemoizedTypeInfo.find(T);
      if (I != MemoizedTypeInfo.end())
        return I->second;

      // This call can invalidate MemoizedTypeInfo[T], so we need a second lookup.
      TypeInfo TI = getTypeInfoImpl(T);
      MemoizedTypeInfo[T] = TI;
      return TI;
    }

We can ignore MemoizedTypeInfo for now (it is just a cache of earlier results); what we need is actually in getTypeInfoImpl:

    /// getTypeInfoImpl - Return the size of the specified type, in bits.  This
    /// method does not work on incomplete types.
    ///
    /// FIXME: Pointers into different addr spaces could have different sizes and
    /// alignment requirements: getPointerInfo should take an AddrSpace, this
    /// should take a QualType, &c.
    TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
      uint64_t Width = 0;
      unsigned Align = 8;
      bool AlignIsRequired = false;
      switch (T->getTypeClass()) {
    #define TYPE(Class, Base)
    #define ABSTRACT_TYPE(Class, Base)
    #define NON_CANONICAL_TYPE(Class, Base)
    #define DEPENDENT_TYPE(Class, Base) case Type::Class:
    #define NON_CANONICAL_UNLESS_DEPENDENT_TYPE(Class, Base)                     \
      case Type::Class:                                                          \
      assert(!T->isDependentType() && "should not see dependent types here");    \
      return getTypeInfo(cast<Class##Type>(T)->desugar().getTypePtr());
    #include "clang/AST/TypeNodes.def"
        llvm_unreachable("Should not see dependent types");

      case Type::FunctionNoProto:
      case Type::FunctionProto:
        // GCC extension: alignof(function) = 32 bits
        Width = 0;
        Align = 32;
        break;

      case Type::IncompleteArray:
      case Type::VariableArray:
        Width = 0;
        Align = getTypeAlign(cast<ArrayType>(T)->getElementType());
        break;

      case Type::ConstantArray: {
        const ConstantArrayType *CAT = cast<ConstantArrayType>(T);

        TypeInfo EltInfo = getTypeInfo(CAT->getElementType());
        uint64_t Size = CAT->getSize().getZExtValue();
        assert((Size == 0 || EltInfo.Width <= (uint64_t)(-1) / Size) &&
               "Overflow in array type bit size evaluation");
        Width = EltInfo.Width * Size;
        Align = EltInfo.Align;
        if (!getTargetInfo().getCXXABI().isMicrosoft() ||
            getTargetInfo().getPointerWidth(0) == 64)
          Width = llvm::RoundUpToAlignment(Width, Align);
        break;
      }
      case Type::ExtVector:
      case Type::Vector: {
        const VectorType *VT = cast<VectorType>(T);
        TypeInfo EltInfo = getTypeInfo(VT->getElementType());
        Width = EltInfo.Width * VT->getNumElements();
        Align = Width;
        // If the alignment is not a power of 2, round up to the next power of 2.
        // This happens for non-power-of-2 length vectors.
        if (Align & (Align-1)) {
          Align = llvm::NextPowerOf2(Align);
          Width = llvm::RoundUpToAlignment(Width, Align);
        }
        // Adjust the alignment based on the target max.
        uint64_t TargetVectorAlign = Target->getMaxVectorAlign();
        if (TargetVectorAlign && TargetVectorAlign < Align)
          Align = TargetVectorAlign;
        break;
      }

      case Type::Builtin:
        switch (cast<BuiltinType>(T)->getKind()) {
        default: llvm_unreachable("Unknown builtin type!");
        case BuiltinType::Void:
          // GCC extension: alignof(void) = 8 bits.
          Width = 0;
          Align = 8;
          break;

        case BuiltinType::Bool:
          Width = Target->getBoolWidth();
          Align = Target->getBoolAlign();
          break;
        case BuiltinType::Char_S:
        case BuiltinType::Char_U:
        case BuiltinType::UChar:
        case BuiltinType::SChar:
          Width = Target->getCharWidth();
          Align = Target->getCharAlign();
          break;
        case BuiltinType::WChar_S:
        case BuiltinType::WChar_U:
          Width = Target->getWCharWidth();
          Align = Target->getWCharAlign();
          break;
        case BuiltinType::Char16:
          Width = Target->getChar16Width();
          Align = Target->getChar16Align();
          break;
        case BuiltinType::Char32:
          Width = Target->getChar32Width();
          Align = Target->getChar32Align();
          break;
        case BuiltinType::UShort:
        case BuiltinType::Short:
          Width = Target->getShortWidth();
          Align = Target->getShortAlign();
          break;
        case BuiltinType::UInt:
        case BuiltinType::Int:
          Width = Target->getIntWidth();
          Align = Target->getIntAlign();
          break;
        case BuiltinType::ULong:
        case BuiltinType::Long:
          Width = Target->getLongWidth();
          Align = Target->getLongAlign();
          break;
        case BuiltinType::ULongLong:
        case BuiltinType::LongLong:
          Width = Target->getLongLongWidth();
          Align = Target->getLongLongAlign();
          break;
        case BuiltinType::Int128:
        case BuiltinType::UInt128:
          Width = 128;
          Align = 128; // int128_t is 128-bit aligned on all targets.
          break;
        case BuiltinType::Half:
          Width = Target->getHalfWidth();
          Align = Target->getHalfAlign();
          break;
        case BuiltinType::Float:
          Width = Target->getFloatWidth();
          Align = Target->getFloatAlign();
          break;
        case BuiltinType::Double:
          Width = Target->getDoubleWidth();
          Align = Target->getDoubleAlign();
          break;
        case BuiltinType::LongDouble:
          Width = Target->getLongDoubleWidth();
          Align = Target->getLongDoubleAlign();
          break;
        case BuiltinType::NullPtr:
          Width = Target->getPointerWidth(0); // C++ 3.9.1p11: sizeof(nullptr_t)
          Align = Target->getPointerAlign(0); //   == sizeof(void*)
          break;
        case BuiltinType::ObjCId:
        case BuiltinType::ObjCClass:
        case BuiltinType::ObjCSel:
          Width = Target->getPointerWidth(0);
          Align = Target->getPointerAlign(0);
          break;
        case BuiltinType::OCLSampler:
          // Samplers are modeled as integers.
          Width = Target->getIntWidth();
          Align = Target->getIntAlign();
          break;
        case BuiltinType::OCLEvent:
        case BuiltinType::OCLImage1d:
        case BuiltinType::OCLImage1dArray:
        case BuiltinType::OCLImage1dBuffer:
        case BuiltinType::OCLImage2d:
        case BuiltinType::OCLImage2dArray:
        case BuiltinType::OCLImage3d:
          // Currently these types are pointers to opaque types.
          Width = Target->getPointerWidth(0);
          Align = Target->getPointerAlign(0);
          break;
        }
        break;
      case Type::ObjCObjectPointer:
        Width = Target->getPointerWidth(0);
        Align = Target->getPointerAlign(0);
        break;
      case Type::BlockPointer: {
        unsigned AS = getTargetAddressSpace(
            cast<BlockPointerType>(T)->getPointeeType());
        Width = Target->getPointerWidth(AS);
        Align = Target->getPointerAlign(AS);
        break;
      }
      case Type::LValueReference:
      case Type::RValueReference: {
        // alignof and sizeof should never enter this code path here, so we go
        // the pointer route.
        unsigned AS = getTargetAddressSpace(
            cast<ReferenceType>(T)->getPointeeType());
        Width = Target->getPointerWidth(AS);
        Align = Target->getPointerAlign(AS);
        break;
      }
      case Type::Pointer: {
        unsigned AS = getTargetAddressSpace(cast<PointerType>(T)->getPointeeType());
        Width = Target->getPointerWidth(AS);
        Align = Target->getPointerAlign(AS);
        break;
      }
      case Type::MemberPointer: {
        const MemberPointerType *MPT = cast<MemberPointerType>(T);
        std::tie(Width, Align) = ABI->getMemberPointerWidthAndAlign(MPT);
        break;
      }
      case Type::Complex: {
        // Complex types have the same alignment as their elements, but twice the
        // size.
        TypeInfo EltInfo = getTypeInfo(cast<ComplexType>(T)->getElementType());
        Width = EltInfo.Width * 2;
        Align = EltInfo.Align;
        break;
      }
      case Type::ObjCObject:
        return getTypeInfo(cast<ObjCObjectType>(T)->getBaseType().getTypePtr());
      case Type::Adjusted:
      case Type::Decayed:
        return getTypeInfo(cast<AdjustedType>(T)->getAdjustedType().getTypePtr());
      case Type::ObjCInterface: {
        const ObjCInterfaceType *ObjCI = cast<ObjCInterfaceType>(T);
        const ASTRecordLayout &Layout = getASTObjCInterfaceLayout(ObjCI->getDecl());
        Width = toBits(Layout.getSize());
        Align = toBits(Layout.getAlignment());
        break;
      }
      case Type::Record:
      case Type::Enum: {
        const TagType *TT = cast<TagType>(T);

        if (TT->getDecl()->isInvalidDecl()) {
          Width = 8;
          Align = 8;
          break;
        }

        if (const EnumType *ET = dyn_cast<EnumType>(TT)) {
          const EnumDecl *ED = ET->getDecl();
          TypeInfo Info =
              getTypeInfo(ED->getIntegerType()->getUnqualifiedDesugaredType());
          if (unsigned AttrAlign = ED->getMaxAlignment()) {
            Info.Align = AttrAlign;
            Info.AlignIsRequired = true;
          }
          return Info;
        }

        const RecordType *RT = cast<RecordType>(TT);
        const RecordDecl *RD = RT->getDecl();
        const ASTRecordLayout &Layout = getASTRecordLayout(RD);
        Width = toBits(Layout.getSize());
        Align = toBits(Layout.getAlignment());
        AlignIsRequired = RD->hasAttr<AlignedAttr>();
        break;
      }

      case Type::SubstTemplateTypeParm:
        return getTypeInfo(cast<SubstTemplateTypeParmType>(T)->
                           getReplacementType().getTypePtr());

      case Type::Auto: {
        const AutoType *A = cast<AutoType>(T);
        assert(!A->getDeducedType().isNull() &&
               "cannot request the size of an undeduced or dependent auto type");
        return getTypeInfo(A->getDeducedType().getTypePtr());
      }

      case Type::Paren:
        return getTypeInfo(cast<ParenType>(T)->getInnerType().getTypePtr());

      case Type::Typedef: {
        const TypedefNameDecl *Typedef = cast<TypedefType>(T)->getDecl();
        TypeInfo Info = getTypeInfo(Typedef->getUnderlyingType().getTypePtr());
        // If the typedef has an aligned attribute on it, it overrides any computed
        // alignment we have.  This violates the GCC documentation (which says that
        // attribute(aligned) can only round up) but matches its implementation.
        if (unsigned AttrAlign = Typedef->getMaxAlignment()) {
          Align = AttrAlign;
          AlignIsRequired = true;
        } else {
          Align = Info.Align;
          AlignIsRequired = Info.AlignIsRequired;
        }
        Width = Info.Width;
        break;
      }

      case Type::Elaborated:
        return getTypeInfo(cast<ElaboratedType>(T)->getNamedType().getTypePtr());

      case Type::Attributed:
        return getTypeInfo(
                      cast<AttributedType>(T)->getEquivalentType().getTypePtr());

      case Type::Atomic: {
        // Start with the base type information.
        TypeInfo Info = getTypeInfo(cast<AtomicType>(T)->getValueType());
        Width = Info.Width;
        Align = Info.Align;

        // If the size of the type doesn't exceed the platform's max
        // atomic promotion width, make the size and alignment more
        // favorable to atomic operations:
        if (Width != 0 && Width <= Target->getMaxAtomicPromoteWidth()) {
          // Round the size up to a power of 2.
          if (!llvm::isPowerOf2_64(Width))
            Width = llvm::NextPowerOf2(Width);

          // Set the alignment equal to the size.
          Align = static_cast<unsigned>(Width);
        }
      }
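
The Vector case is the least obvious arithmetic in the listing, so here is a worked sketch of it. The helpers below are hypothetical reimplementations written for this answer, modelled on what llvm::NextPowerOf2 (next power of two strictly greater than its argument) and llvm::RoundUpToAlignment do; the target's maximum vector alignment is deliberately left out.

```cpp
#include <cstdint>

// Sketch of NextPowerOf2: smallest power of two strictly greater than a.
constexpr std::uint64_t nextPowerOf2(std::uint64_t a) {
    a |= a >> 1;
    a |= a >> 2;
    a |= a >> 4;
    a |= a >> 8;
    a |= a >> 16;
    a |= a >> 32;
    return a + 1;
}

// Sketch of RoundUpToAlignment: round value up to a multiple of align.
constexpr std::uint64_t roundUpToAlignment(std::uint64_t value,
                                           std::uint64_t align) {
    return (value + align - 1) / align * align;
}

// Hypothetical result type: both fields are in bits.
struct WidthAlign {
    std::uint64_t width;
    std::uint64_t align;
};

// Mirrors the Vector case, without the target max-alignment clamp.
constexpr WidthAlign vectorInfo(std::uint64_t eltWidthBits, unsigned numElts) {
    WidthAlign info{eltWidthBits * numElts, eltWidthBits * numElts};
    if (info.align & (info.align - 1)) {  // not a power of two
        info.align = nextPowerOf2(info.align);
        info.width = roundUpToAlignment(info.width, info.align);
    }
    return info;
}

// Three 32-bit elements: 96 bits is not a power of two, so both the
// alignment and the width become 128 bits (the size of a 4-element vector).
static_assert(vectorInfo(32, 3).width == 128, "3 x 32 bits rounds to 128");
static_assert(vectorInfo(32, 3).align == 128, "alignment rounds to 128");
static_assert(vectorInfo(32, 4).width == 128, "4 x 32 bits is already 128");
```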

Everything is now clear; no further explanation needed :-)
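
Several of the cases in getTypeInfoImpl correspond to guarantees that can be checked directly from C++ source. The sketch below uses hypothetical types (`Color`, `Record`, `RecordAlias`) and only asserts properties the language itself guarantees, so it should hold on any conforming compiler, not just Clang.

```cpp
#include <cstddef>

// Hypothetical types used only for this illustration.
enum class Color : short { Red, Green };

struct Record {
    char c;
    int i;
};

typedef Record RecordAlias;

// ConstantArray: element width multiplied by the element count.
static_assert(sizeof(int[10]) == 10 * sizeof(int), "array size");

// NullPtr: same width as a pointer.
static_assert(sizeof(std::nullptr_t) == sizeof(void*), "nullptr_t size");

// Enum: the size of the underlying integer type.
static_assert(sizeof(Color) == sizeof(short), "enum size");

// Typedef: the width comes from the underlying type.
static_assert(sizeof(RecordAlias) == sizeof(Record), "typedef size");

// Record: the layout may add padding, and the size stays a multiple of the
// alignment so that arrays of the type are laid out correctly.
static_assert(sizeof(Record) >= sizeof(char) + sizeof(int), "padding allowed");
static_assert(sizeof(Record) % alignof(Record) == 0, "size vs alignment");
```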
