Solidity offers many high-level language abstractions, but these features make it hard to understand what’s really going on when my program is running.
Reading the Solidity documentation still left me confused over very basic things.
What are the differences between string, bytes32, byte[], bytes?
Which one do I use, when?
What’s happening when I cast a string to bytes? Can I cast to byte[]?
How much do they cost?
How are mappings stored by the EVM?
Why can’t I delete a mapping?
Can I have mappings of mappings? (Yes, but how does that work?)
Why is there storage mapping, but no memory mapping?
How does a compiled contract look to the EVM?
How is a contract created?
What is a constructor, really?
What is the fallback function?
One Storage Variable:
    // c1.sol
    pragma solidity ^0.4.11;

    contract C {
        uint256 a; // a state variable

        // a constructor
        function C() {
            a = 1;
        }
    }
The EVM is a stack machine.
Instructions might use values on the stack as arguments, and push values onto the stack as results. Let’s consider the operation add.
Suppose there are two values on the stack:
    [1 2]
When the EVM sees add, it adds the top 2 items together, and pushes the answer back onto the stack, resulting in:
    [3]
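The add step can be sketched in a few lines of Python. This is only an illustrative model, not EVM code: the list `stack` and the helper `add` are hypothetical names, and the end of the list is treated as the top of the stack.

```python
# Model the stack as a Python list; the end of the list is the top.
stack = [2, 1]  # written [1 2] in the text, top item first


def add(stack):
    """Pop the top two items and push their sum, wrapping at 256 bits."""
    a = stack.pop()
    b = stack.pop()
    stack.append((a + b) % 2**256)


add(stack)
print(stack)  # [3]
```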
Notation:
In what follows, we’ll notate the stack with []:
    // The empty stack
    stack: []
    // Stack with three items. The top item is 3. The bottom item is 1.
    stack: [3 2 1]
And notate the contract storage with {}:
    // Nothing in storage.
    store: {}
    // The value 0x1 is stored at the position 0x0.
    store: { 0x0 => 0x1 }
Simulation:
Let’s now look at some real bytecode. We’ll simulate the bytecode sequence 6001600081905550 (the code for a = 1) as the EVM would, and print out the machine state after each instruction:
    // 60 01: pushes 1 onto stack
    0x1
      stack: [0x1]

    // 60 00: pushes 0 onto stack
    0x0
      stack: [0x0 0x1]

    // 81: duplicate the second item on the stack
    dup2
      stack: [0x1 0x0 0x1]

    // 90: swap the top two items
    swap1
      stack: [0x0 0x1 0x1]

    // 55: store the value 0x1 at position 0x0
    // This instruction consumes the top 2 items
    sstore
      stack: [0x1]
      store: { 0x0 => 0x1 }

    // 50: pop (throw away the top item)
    pop
      stack: []
      store: { 0x0 => 0x1 }
The end. The stack is empty, and there’s one item in storage.
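The trace above can be reproduced with a toy interpreter. The sketch below is an assumption-laden model rather than a real EVM: the function `run` is a hypothetical helper that understands only the five opcodes used in this sequence (0x60 push1, 0x81 dup2, 0x90 swap1, 0x55 sstore, 0x50 pop) and ignores gas, memory, and 256-bit word handling.

```python
def run(bytecode_hex):
    """Execute a tiny subset of EVM opcodes and print the state after each one."""
    code = bytes.fromhex(bytecode_hex)
    stack = []  # end of the list is the top of the stack
    store = {}  # contract storage: position -> value
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == 0x60:  # push1: push the byte that follows the opcode
            stack.append(code[pc])
            pc += 1
            name = "push1"
        elif op == 0x81:  # dup2: copy the second item onto the top
            stack.append(stack[-2])
            name = "dup2"
        elif op == 0x90:  # swap1: swap the top two items
            stack[-1], stack[-2] = stack[-2], stack[-1]
            name = "swap1"
        elif op == 0x55:  # sstore: position is on top, value is beneath it
            position = stack.pop()
            value = stack.pop()
            store[position] = value
            name = "sstore"
        elif op == 0x50:  # pop: discard the top item
            stack.pop()
            name = "pop"
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
        # Print the stack top-first, matching the notation used in the text.
        shown_stack = " ".join(hex(v) for v in reversed(stack))
        shown_store = ", ".join(f"{hex(k)} => {hex(v)}" for k, v in store.items())
        print(f"{name:<7} stack: [{shown_stack}] store: {{ {shown_store} }}")
    return stack, store


final_stack, final_store = run("6001600081905550")
```

Running it leaves `final_stack` empty and `final_store` holding the value 1 at position 0, the same end state as the hand-written trace.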
What’s worth noting is that Solidity has decided to store the state variable uint256 a at position 0x0.
It’s perfectly possible for other languages to choose to store the state variable elsewhere.
Equivalent representation:
In pseudocode, what the EVM does for 6001600081905550 is essentially:
    // a = 1
    sstore(0x0, 0x1)
Here dup2, swap1, and pop are redundant, so the assembly could be simpler:
    0x1
    0x0
    sstore
You could try to simulate the above 3 instructions, for example by stepping through them by hand, and satisfy yourself that they indeed result in the same machine state:
    stack: []
    store: { 0x0 => 0x1 }
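One way to check the equivalence without stepping through by hand is to run both instruction sequences through the same toy model and compare the results. In this sketch the function `execute` and the list-of-mnemonics program format are hypothetical conveniences, not anything produced by solc.

```python
def execute(program):
    """Run a list of instructions; integers are pushes, strings are opcodes."""
    stack, store = [], {}  # end of the stack list is the top
    for instr in program:
        if isinstance(instr, int):
            stack.append(instr)
        elif instr == "dup2":
            stack.append(stack[-2])
        elif instr == "swap1":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif instr == "sstore":
            position = stack.pop()  # top item is the storage position
            value = stack.pop()     # next item is the value to store
            store[position] = value
        elif instr == "pop":
            stack.pop()
        else:
            raise ValueError(f"unsupported instruction {instr!r}")
    return stack, store


compiled = execute([0x1, 0x0, "dup2", "swap1", "sstore", "pop"])
simplified = execute([0x1, 0x0, "sstore"])
print(compiled == simplified)  # True
```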
Two Storage Variables:
Let’s add one extra storage variable:
    // c2.sol
    pragma solidity ^0.4.11;

    contract C {
        uint256 a;
        uint256 b;

        function C() {
            a = 1;
            b = 2;
        }
    }
In pseudocode, the constructor is now essentially:
    // a = 1
    sstore(0x0, 0x1)
    // b = 2
    sstore(0x1, 0x2)
What we learn here is that the two storage variables are positioned one after the other, with a in position 0x0 and b in position 0x1.
Storage Packing:
Each storage slot can store 32 bytes. It’d be wasteful to use all 32 bytes if a variable only needs 16 bytes.
Solidity optimizes for storage efficiency by packing two smaller data types into one storage slot if possible.
Let’s change a and b so they are only 16 bytes each:
    pragma solidity ^0.4.11;

    contract C {
        uint128 a;
        uint128 b;

        function C() {
            a = 1;
            b = 2;
        }
    }
The compiled code packs these two variables together in one storage position (0x0), like this:
    [         b         ][         a         ]
    [16 bytes / 128 bits][16 bytes / 128 bits]
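The layout rule can be sketched as a small allocator. This is a simplified model that assumes only fixed-size value types declared in order (structs, arrays, and mappings follow additional rules); the function `layout` and its output format are hypothetical. The second half builds the 32-byte word that slot 0x0 would hold for a = 1 and b = 2.

```python
SLOT_SIZE = 32  # bytes per storage slot


def layout(sizes_in_bytes):
    """Assign a (slot, byte offset) pair to each variable in declaration order."""
    placements = []
    slot, used = 0, 0
    for size in sizes_in_bytes:
        if used + size > SLOT_SIZE:  # does not fit: start a new slot
            slot += 1
            used = 0
        placements.append((slot, used))
        used += size
    return placements


print(layout([32, 32]))  # two uint256: [(0, 0), (1, 0)]
print(layout([16, 16]))  # two uint128: [(0, 0), (0, 16)]

# Pack a = 1 into the low 16 bytes and b = 2 into the high 16 bytes.
a, b = 1, 2
word = a | (b << 128)
print(f"{word:064x}")
```

The printed word shows b in the left (high-order) half and a in the right (low-order) half, matching the diagram.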
The reason to pack is that storage operations are by far the most expensive:
sstore costs 20000 gas for first write to a new position.
sstore costs 5000 gas for subsequent writes to an existing position.
sload costs 200 gas.
Most instructions cost 3 to 10 gas.
By using the same storage position, the store for the second variable costs 5000 gas instead of 20000, saving us 15000 gas.
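The saving is simple arithmetic, sketched below. The two constants are the sstore figures quoted above, and `store_cost` is a hypothetical helper; real gas schedules have changed across network upgrades, so treat the numbers as illustrative.

```python
SSTORE_NEW = 20000       # first write to a new position (figure from the text)
SSTORE_EXISTING = 5000   # later write to an existing position (figure from the text)


def store_cost(positions_written):
    """Total sstore gas for a sequence of storage positions written in order."""
    seen = set()
    total = 0
    for position in positions_written:
        total += SSTORE_EXISTING if position in seen else SSTORE_NEW
        seen.add(position)
    return total


separate_slots = store_cost([0, 1])  # a in slot 0, b in slot 1
shared_slot = store_cost([0, 0])     # a and b both in slot 0
print(separate_slots, shared_slot, separate_slots - shared_slot)  # 40000 25000 15000
```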
More Optimization:
Instead of storing a and b with two separate sstore instructions, it should be possible to pack the two 128-bit numbers together in memory and then store them with a single sstore, saving an additional 5000 gas.
You can ask Solidity to make this optimization by turning on the optimize flag:
$ solc --bin --asm --optimize c3.sol
This produces assembly code that uses just one sload and one sstore: