Autodetecting Unlocked Compiler Versions in Solidity

Building tools to automatically detect issues in Solidity code. Compatible with all existing EVMs

I want to show you how source code analyzers work with a simple example. Let’s build an analyzer that checks whether Solidity source files contain a floating pragma declaration. There are some steps we need to follow: find or build a good Solidity grammar file, parse the input content, build a parse tree, process the tree, and finally, report the issues found. The whole process is explained below.

Solidity Language Grammar definition

I will be using the ANTLR grammar file provided by the Solidity project, but you can use any other grammar file, like this one:

// Copyright 2016-2019 Federico Bond <federicobond@gmail.com>
// Licensed under the MIT license. See LICENSE file in the project root for details.

grammar Solidity;

sourceUnit
  : (pragmaDirective | importDirective | contractDefinition)* EOF ;

pragmaDirective
  : 'pragma' pragmaName pragmaValue ';' ;

pragmaName
  : identifier ;

pragmaValue
  : version | expression ;

version
  : versionConstraint versionConstraint? ;

versionOperator
  : '^' | '~' | '>=' | '>' | '<' | '<=' | '=' ;

versionConstraint
  : versionOperator? VersionLiteral ;

importDeclaration
  : identifier ('as' identifier)? ;

importDirective
  : 'import' StringLiteral ('as' identifier)? ';'
  | 'import' ('*' | identifier) ('as' identifier)? 'from' StringLiteral ';'
  | 'import' '{' importDeclaration ( ',' importDeclaration )* '}' 'from' StringLiteral ';' ;

NatSpecSingleLine
  : ('///' .*? [\r\n]) + ;

NatSpecMultiLine
  : '/**' .*? '*/' ;

natSpec
  : NatSpecSingleLine
  | NatSpecMultiLine ;

contractDefinition
  : natSpec? ( 'contract' | 'interface' | 'library' ) identifier
    ( 'is' inheritanceSpecifier (',' inheritanceSpecifier )* )?
    '{' contractPart* '}' ;

inheritanceSpecifier
  : userDefinedTypeName ( '(' expressionList? ')' )? ;

contractPart
  : stateVariableDeclaration
  | usingForDeclaration
  | structDefinition
  | constructorDefinition
  | modifierDefinition
  | functionDefinition
  | eventDefinition
  | enumDefinition ;

stateVariableDeclaration
  : typeName
    ( PublicKeyword | InternalKeyword | PrivateKeyword | ConstantKeyword )*
    identifier ('=' expression)? ';' ;

usingForDeclaration
  : 'using' identifier 'for' ('*' | typeName) ';' ;

structDefinition
  : 'struct' identifier
    '{' ( variableDeclaration ';' (variableDeclaration ';')* )? '}' ;

constructorDefinition
  : 'constructor' parameterList modifierList block ;

modifierDefinition
  : 'modifier' identifier parameterList? block ;

modifierInvocation
  : identifier ( '(' expressionList? ')' )? ;

functionDefinition
  : natSpec? 'function' identifier? parameterList modifierList returnParameters? ( ';' | block ) ;

returnParameters
  : 'returns' parameterList ;

modifierList
  : ( modifierInvocation | stateMutability | ExternalKeyword
    | PublicKeyword | InternalKeyword | PrivateKeyword )* ;

eventDefinition
  : natSpec? 'event' identifier eventParameterList AnonymousKeyword? ';' ;

enumValue
  : identifier ;

enumDefinition
  : 'enum' identifier '{' enumValue? (',' enumValue)* '}' ;

parameterList
  : '(' ( parameter (',' parameter)* )? ')' ;

parameter
  : typeName storageLocation? identifier? ;

eventParameterList
  : '(' ( eventParameter (',' eventParameter)* )? ')' ;

eventParameter
  : typeName IndexedKeyword? identifier? ;

functionTypeParameterList
  : '(' ( functionTypeParameter (',' functionTypeParameter)* )? ')' ;

functionTypeParameter
  : typeName storageLocation? ;

variableDeclaration
  : typeName storageLocation? identifier ;

typeName
  : elementaryTypeName
  | userDefinedTypeName
  | mapping
  | typeName '[' expression? ']'
  | functionTypeName
  | 'address' 'payable' ;

userDefinedTypeName
  : identifier ( '.' identifier )* ;

mapping
  : 'mapping' '(' elementaryTypeName '=>' typeName ')' ;

functionTypeName
  : 'function' functionTypeParameterList
    ( InternalKeyword | ExternalKeyword | stateMutability )*
    ( 'returns' functionTypeParameterList )? ;

storageLocation
  : 'memory' | 'storage' | 'calldata';

stateMutability
  : PureKeyword | ConstantKeyword | ViewKeyword | PayableKeyword ;

block
  : '{' statement* '}' ;

statement
  : ifStatement
  | whileStatement
  | forStatement
  | block
  | inlineAssemblyStatement
  | doWhileStatement
  | continueStatement
  | breakStatement
  | returnStatement
  | throwStatement
  | emitStatement
  | simpleStatement ;

expressionStatement
  : expression ';' ;

ifStatement
  : 'if' '(' expression ')' statement ( 'else' statement )? ;

whileStatement
  : 'while' '(' expression ')' statement ;

simpleStatement
  : ( variableDeclarationStatement | expressionStatement ) ;

forStatement
  : 'for' '(' ( simpleStatement | ';' ) ( expressionStatement | ';' ) expression? ')' statement ;

inlineAssemblyStatement
  : 'assembly' StringLiteral? assemblyBlock ;

doWhileStatement
  : 'do' statement 'while' '(' expression ')' ';' ;

continueStatement
  : 'continue' ';' ;

breakStatement
  : 'break' ';' ;

returnStatement
  : 'return' expression? ';' ;

throwStatement
  : 'throw' ';' ;

emitStatement
  : 'emit' functionCall ';' ;

variableDeclarationStatement
  : ( 'var' identifierList | variableDeclaration | '(' variableDeclarationList ')' ) ( '=' expression )? ';';

variableDeclarationList
  : variableDeclaration? (',' variableDeclaration? )* ;

identifierList
  : '(' ( identifier? ',' )* identifier? ')' ;

elementaryTypeName
  : 'address' | 'bool' | 'string' | 'var' | Int | Uint | 'byte' | Byte | Fixed | Ufixed ;

Int
  : 'int' | 'int8' | 'int16' | 'int24' | 'int32' | 'int40' | 'int48' | 'int56' | 'int64' | 'int72' | 'int80' | 'int88' | 'int96' | 'int104' | 'int112' | 'int120' | 'int128' | 'int136' | 'int144' | 'int152' | 'int160' | 'int168' | 'int176' | 'int184' | 'int192' | 'int200' | 'int208' | 'int216' | 'int224' | 'int232' | 'int240' | 'int248' | 'int256' ;

Uint
  : 'uint' | 'uint8' | 'uint16' | 'uint24' | 'uint32' | 'uint40' | 'uint48' | 'uint56' | 'uint64' | 'uint72' | 'uint80' | 'uint88' | 'uint96' | 'uint104' | 'uint112' | 'uint120' | 'uint128' | 'uint136' | 'uint144' | 'uint152' | 'uint160' | 'uint168' | 'uint176' | 'uint184' | 'uint192' | 'uint200' | 'uint208' | 'uint216' | 'uint224' | 'uint232' | 'uint240' | 'uint248' | 'uint256' ;

Byte
  : 'bytes' | 'bytes1' | 'bytes2' | 'bytes3' | 'bytes4' | 'bytes5' | 'bytes6' | 'bytes7' | 'bytes8' | 'bytes9' | 'bytes10' | 'bytes11' | 'bytes12' | 'bytes13' | 'bytes14' | 'bytes15' | 'bytes16' | 'bytes17' | 'bytes18' | 'bytes19' | 'bytes20' | 'bytes21' | 'bytes22' | 'bytes23' | 'bytes24' | 'bytes25' | 'bytes26' | 'bytes27' | 'bytes28' | 'bytes29' | 'bytes30' | 'bytes31' | 'bytes32' ;

Fixed
  : 'fixed' | ( 'fixed' [0-9]+ 'x' [0-9]+ ) ;

Ufixed
  : 'ufixed' | ( 'ufixed' [0-9]+ 'x' [0-9]+ ) ;

expression
  : expression ('++' | '--')
  | 'new' typeName
  | expression '[' expression ']'
  | expression '(' functionCallArguments ')'
  | expression '.' identifier
  | '(' expression ')'
  | ('++' | '--') expression
  | ('+' | '-') expression
  | ('after' | 'delete') expression
  | '!' expression
  | '~' expression
  | expression '**' expression
  | expression ('*' | '/' | '%') expression
  | expression ('+' | '-') expression
  | expression ('<<' | '>>') expression
  | expression '&' expression
  | expression '^' expression
  | expression '|' expression
  | expression ('<' | '>' | '<=' | '>=') expression
  | expression ('==' | '!=') expression
  | expression '&&' expression
  | expression '||' expression
  | expression '?' expression ':' expression
  | expression ('=' | '|=' | '^=' | '&=' | '<<=' | '>>=' | '+=' | '-=' | '*=' | '/=' | '%=') expression
  | primaryExpression ;

primaryExpression
  : BooleanLiteral
  | numberLiteral
  | HexLiteral
  | StringLiteral
  | identifier ('[' ']')?
  | TypeKeyword
  | tupleExpression
  | typeNameExpression ('[' ']')? ;

expressionList
  : expression (',' expression)* ;

nameValueList
  : nameValue (',' nameValue)* ','? ;

nameValue
  : identifier ':' expression ;

functionCallArguments
  : '{' nameValueList? '}'
  | expressionList? ;

functionCall
  : expression '(' functionCallArguments ')' ;

assemblyBlock
  : '{' assemblyItem* '}' ;

assemblyItem
  : identifier
  | assemblyBlock
  | assemblyExpression
  | assemblyLocalDefinition
  | assemblyAssignment
  | assemblyStackAssignment
  | labelDefinition
  | assemblySwitch
  | assemblyFunctionDefinition
  | assemblyFor
  | assemblyIf
  | BreakKeyword
  | ContinueKeyword
  | subAssembly
  | numberLiteral
  | StringLiteral
  | HexLiteral ;

assemblyExpression
  : assemblyCall | assemblyLiteral ;

assemblyCall
  : ( 'return' | 'address' | 'byte' | identifier ) ( '(' assemblyExpression? ( ',' assemblyExpression )* ')' )? ;

assemblyLocalDefinition
  : 'let' assemblyIdentifierOrList ( ':=' assemblyExpression )? ;

assemblyAssignment
  : assemblyIdentifierOrList ':=' assemblyExpression ;

assemblyIdentifierOrList
  : identifier | '(' assemblyIdentifierList ')' ;

assemblyIdentifierList
  : identifier ( ',' identifier )* ;

assemblyStackAssignment
  : '=:' identifier ;

labelDefinition
  : identifier ':' ;

assemblySwitch
  : 'switch' assemblyExpression assemblyCase* ;

assemblyCase
  : 'case' assemblyLiteral assemblyBlock
  | 'default' assemblyBlock ;

assemblyFunctionDefinition
  : 'function' identifier '(' assemblyIdentifierList? ')'
    assemblyFunctionReturns? assemblyBlock ;

assemblyFunctionReturns
  : ( '->' assemblyIdentifierList ) ;

assemblyFor
  : 'for' ( assemblyBlock | assemblyExpression )
    assemblyExpression ( assemblyBlock | assemblyExpression ) assemblyBlock ;

assemblyIf
  : 'if' assemblyExpression assemblyBlock ;

assemblyLiteral
  : StringLiteral | DecimalNumber | HexNumber | HexLiteral ;

subAssembly
  : 'assembly' identifier assemblyBlock ;

tupleExpression
  : '(' ( expression? ( ',' expression? )* ) ')'
  | '[' ( expression ( ',' expression )* )? ']' ;

typeNameExpression
  : elementaryTypeName
  | userDefinedTypeName ;

numberLiteral
  : (DecimalNumber | HexNumber) NumberUnit? ;

identifier
  : ('from' | 'calldata' | Identifier) ;

VersionLiteral
  : [0-9]+ '.' [0-9]+ '.' [0-9]+ ;

BooleanLiteral
  : 'true' | 'false' ;

DecimalNumber
  : ( DecimalDigits | (DecimalDigits? '.' DecimalDigits) ) ( [eE] DecimalDigits )? ;

fragment
DecimalDigits
  : [0-9] ( '_'? [0-9] )* ;

HexNumber
  : '0' [xX] HexDigits ;

fragment
HexDigits
  : HexCharacter ( '_'? HexCharacter )* ;

NumberUnit
  : 'wei' | 'szabo' | 'finney' | 'ether'
  | 'seconds' | 'minutes' | 'hours' | 'days' | 'weeks' | 'years' ;

HexLiteral : 'hex' ('"' HexPair* '"' | '\'' HexPair* '\'') ;

fragment
HexPair
  : HexCharacter HexCharacter ;

fragment
HexCharacter
  : [0-9A-Fa-f] ;

ReservedKeyword
  : 'abstract'
  | 'after'
  | 'case'
  | 'catch'
  | 'default'
  | 'final'
  | 'in'
  | 'inline'
  | 'let'
  | 'match'
  | 'null'
  | 'of'
  | 'relocatable'
  | 'static'
  | 'switch'
  | 'try'
  | 'typeof' ;

AnonymousKeyword : 'anonymous' ;
BreakKeyword : 'break' ;
ConstantKeyword : 'constant' ;
ContinueKeyword : 'continue' ;
ExternalKeyword : 'external' ;
IndexedKeyword : 'indexed' ;
InternalKeyword : 'internal' ;
PayableKeyword : 'payable' ;
PrivateKeyword : 'private' ;
PublicKeyword : 'public' ;
PureKeyword : 'pure' ;
TypeKeyword : 'type' ;
ViewKeyword : 'view' ;

Identifier
  : IdentifierStart IdentifierPart* ;

fragment
IdentifierStart
  : [a-zA-Z$_] ;

fragment
IdentifierPart
  : [a-zA-Z0-9$_] ;

StringLiteral
  : '"' DoubleQuotedStringCharacter* '"'
  | '\'' SingleQuotedStringCharacter* '\'' ;

fragment
DoubleQuotedStringCharacter
  : ~["\r\n\\] | ('\\' .) ;

fragment
SingleQuotedStringCharacter
  : ~['\r\n\\] | ('\\' .) ;

WS
  : [ \t\r\n\u000C]+ -> skip ;

COMMENT
  : '/*' .*? '*/' -> channel(HIDDEN) ;

LINE_COMMENT
  : '//' ~[\r\n]* -> channel(HIDDEN) ;

Once we have the grammar file, the next step is to generate the Go code from it. You can find the grammar file I used at https://github.com/ethereum/solidity/tree/develop/docs/grammar
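
To get a feel for what the pragma rules of a grammar like the one listed above accept, here is a minimal Go sketch that mirrors the versionOperator and VersionLiteral rules with a regular expression. The names Constraint and parseConstraint are hypothetical and are not part of any ANTLR-generated code; the real parser does this work for you.

```go
package main

import (
	"fmt"
	"regexp"
)

// Constraint is a hypothetical type holding one parsed version constraint.
type Constraint struct {
	Operator string // "", "^", "~", ">=", ">", "<", "<=" or "="
	Version  string // e.g. "0.8.13"
}

// constraintRe mirrors the grammar rule: versionOperator? VersionLiteral.
// Two-character operators are listed before their one-character prefixes.
var constraintRe = regexp.MustCompile(`^(\^|~|>=|<=|>|<|=)?([0-9]+\.[0-9]+\.[0-9]+)$`)

// parseConstraint splits a single constraint such as "^0.8.13" into its
// operator and version. The boolean is false when the input does not match.
func parseConstraint(s string) (Constraint, bool) {
	m := constraintRe.FindStringSubmatch(s)
	if m == nil {
		return Constraint{}, false
	}
	return Constraint{Operator: m[1], Version: m[2]}, true
}

func main() {
	for _, in := range []string{"^0.8.13", "0.8.13", ">=0.4.22", "latest"} {
		c, ok := parseConstraint(in)
		fmt.Printf("%-10q ok=%-5v operator=%q version=%q\n", in, ok, c.Operator, c.Version)
	}
}
```

An empty Operator means the version is pinned, which is exactly the case our detector will consider safe.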

Building the grammar

To build the grammar, you need to download the antlr-4.9.3-complete.jar tool and have a JRE installed on your computer.

java -jar 'antlr-4.9.3-complete.jar' -Dlanguage=Go -listener -visitor -o parser ./SolidityLexer.g4
java -jar 'antlr-4.9.3-complete.jar' -Dlanguage=Go -listener -visitor -o parser ./Solidity.g4

NOTE: the following replacements are required in the autogenerated grammar Go code:

  • replace: type=typeName with varType=typeName
  • replace: String with StringLiteral

Adding test data

To evaluate the detector, we need some Solidity code examples as test data. The easiest way is to download an open-source Solidity project from GitHub. In this case, I chose to use code snippets from https://solidity-by-example.org/first-app/. Our test example will be:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

contract Counter {
    uint public count;

    // Function to get the current count
    function get() public view returns (uint) {
        return count;
    }

    // Function to increment count by 1
    function inc() public {
        count += 1;
    }

    // Function to decrement count by 1
    function dec() public {
        // This function will fail if count = 0
        count -= 1;
    }
}

We copy the content to a local first-app.sol file and store it in our project's ./testdata directory.

Building our test before the implementation

This is known as TDD, or Test Driven Development. One of its foundations is to build your code around a collection of tests: in this scenario, the tests are designed first, and then the code is developed until all of them pass.

In my case, I wrote the following test and a basic, empty CheckVersion function that will hold all the complexity.

func TestDetector(t *testing.T) {
    t.Run("first-app-example", func(t *testing.T) {
        data, err := ioutil.ReadFile("testdata/first-app.sol")
        assert.NoError(t, err)
        assert.NotNil(t, data)
        result, err2 := CheckVersion(data)
        assert.NoError(t, err2)
        assert.NotNil(t, result)
        assert.True(t, result.Errored)
    })
}

And the placeholder function:

func CheckVersion(code []byte) (VersionStatus, error) {
    var v VersionStatus
    return v, nil
}

If we run the test with go test, it should fail since we don’t have any valid code yet.

--- FAIL: TestDetector (0.00s)
    --- FAIL: TestDetector/first-app-example (0.00s)
        detector_test.go:19: 
                Error Trace:    /home/r00t/go/src/github.com/zerjioang/solidity-version-check/detector_test.go:19
                Error:          Should be true
                Test:           TestDetector/first-app-example
FAIL
exit status 1
FAIL    github.com/zerjioang/solidity-version-check     0.010s

Building our result data model

After the algorithm runs, the CheckVersion function should return some information about the detection process. That information is held by the VersionStatus struct, defined as:

type VersionStatus struct {
    Errored bool `json:"errored,omitempty"`
}
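
Because the struct carries a JSON tag, it can be serialized directly with the standard library. The following is a minimal sketch; the encode helper is a hypothetical name introduced only for illustration. Note that with omitempty a false Errored field is dropped from the output entirely, so a clean result encodes as an empty object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VersionStatus is the result model defined in the article.
type VersionStatus struct {
	Errored bool `json:"errored,omitempty"`
}

// encode is a hypothetical helper that renders a status as JSON text.
func encode(v VersionStatus) (string, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	for _, v := range []VersionStatus{{Errored: true}, {Errored: false}} {
		out, err := encode(v)
		if err != nil {
			fmt.Println("encode failed:", err)
			continue
		}
		// Prints {"errored":true} for the first value and {} for the second.
		fmt.Println(out)
	}
}
```

If consumers of the JSON need to see an explicit false, dropping omitempty from the tag would be the simplest change.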

Adding basic ANTLR code to our function

Now that we have defined the function's input and output parameters as

func CheckVersion(code []byte) (VersionStatus, error)

it is time to build the body. According to the ANTLR documentation and some blogs out there, like GopherAcademy, the basic steps to include are:

  1. Read input file content
  2. Build the lexer
  3. Build the token stream for lexer data
  4. Build a parser for token stream data
  5. Build an event listener for the parser
  6. Walk the parse tree

In code, the previous steps are:

func CheckVersion(code []byte) (VersionStatus, error) {
    var v VersionStatus

    // Setup the input
    is := antlr.NewInputStream(string(code))

    // Create the Lexer
    lexer := solidity.NewSolidityLexer(is)
    stream := antlr.NewCommonTokenStream(lexer, antlr.TokenDefaultChannel)

    // Create the Parser
    p := solidity.NewSolidityParser(stream)

    // Finally, parse the source unit and walk the tree with our listener
    var listener solidity.CustomSolidityListener
    antlr.ParseTreeWalkerDefault.Walk(&listener, p.SourceUnit())

    return v, nil
}

So, at this point the CheckVersion function can read and parse the input data according to the provided Solidity grammar and walk the parse tree. Still, this is not enough to pass the test:

detector_test.go:19: 
        Error Trace:    /home/r00t/go/src/github.com/zerjioang/solidity-version-check/detector_test.go:19
        Error:          Should be true
        Test:           TestDetector/first-app-example
--- FAIL: TestDetector (0.01s)
--- FAIL: TestDetector/first-app-example (0.01s)

Detecting Unlocked Compiler Versions Programmatically

This is where all our logic needs to be implemented. We need to find the right spot in the listener so that the algorithm can detect the issue and trigger an alert. This step requires reviewing and understanding the grammar file. After some digging, we found that the best point for our detection is the pragmaDirective rule.

pragmaDirective: Pragma PragmaToken+ PragmaSemicolon;

For this purpose, we implement a custom event logic in the pragmaDirective rule.

// EnterPragmaDirective is called when production pragmaDirective is entered.
func (s *CustomSolidityListener) EnterPragmaDirective(ctx *PragmaDirectiveContext) {
    // 1 read the content of the pragma
    // 2 check if it's unlocked
    // 3 trigger an alert
}

Depending on the information we need to read, we call one method or another. For example:

  • ctx.GetText(): returns pragma solidity ^0.8.13;
  • ctx.PragmaToken(0): returns the first child of type PragmaToken
  • ctx.PragmaToken(0).GetText(): returns solidity ^0.8.13;

So with these tips in mind, you can now build your own simple if-else conditional to trigger an alert when a floating version operator such as ^ is found in the solidity pragma declaration.
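
As a minimal sketch of that conditional, the helper below works on the pragma text (for example, the string obtained from ctx.GetText()) and uses only the standard library. The names isFloatingPragma and checkPragma are hypothetical, and treating every operator other than an optional = as "unlocked" is my assumption; you may prefer to flag only ^.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// VersionStatus is the result model defined in the article.
type VersionStatus struct {
	Errored bool `json:"errored,omitempty"`
}

// lockedRe matches a pragma pinned to one exact version, for example
// "pragmasolidity0.8.13;" (all whitespace is removed before matching).
var lockedRe = regexp.MustCompile(`^pragmasolidity=?[0-9]+\.[0-9]+\.[0-9]+;?$`)

// isFloatingPragma is a hypothetical helper. It reports whether the text of a
// solidity pragma leaves the compiler version unlocked. Other pragmas, such
// as "pragma abicoder v2;", are ignored and reported as not floating.
func isFloatingPragma(text string) bool {
	// Remove whitespace so the check works with or without spaces.
	compact := strings.Join(strings.Fields(text), "")
	if !strings.HasPrefix(compact, "pragmasolidity") {
		return false
	}
	return !lockedRe.MatchString(compact)
}

// checkPragma shows how a listener callback could record the finding in a
// status value shared with the caller.
func checkPragma(text string, status *VersionStatus) {
	if isFloatingPragma(text) {
		status.Errored = true
	}
}

func main() {
	var status VersionStatus
	checkPragma("pragma solidity ^0.8.13;", &status)
	fmt.Println("floating pragma found:", status.Errored)
}
```

In the real listener, EnterPragmaDirective would call something like checkPragma with the context text and a pointer to the status that CheckVersion eventually returns.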

Unlocked compiler version alert reporting

After implementing the alert detection for unlocked pragmas, we can now report the result to the user. I chose to report via stdout as follows, but you can choose whatever method you want, for example: encoding the result as JSON and exposing it through an API, sending an automated email notification, a Telegram message, etc.

Unlocked Compiler Version Detected
----------------------------------
Affected line (L2) : pragma solidity ^0.8.13;
Suggested fix      : pragma solidity 0.8.13;
Confidence         : Very High
Impact             : Informational

As you can see, I also added a fix suggestion for the detected alert, which can help newcomers solve the issue quickly. Finally, we need to run the test again to see if it passes.

Unlocked Compiler Version Detected
----------------------------------
Affected line (L2) : pragma solidity ^0.8.13;
Suggested fix      : pragma solidity 0.8.13;
Confidence         : Very High
Impact             : Informational

PASS
ok      github.com/zerjioang/solidity-version-check     0.026s

And as always, the process needs to be fast. In this case, only 0.026 seconds were required for the whole process.

Conclusion

I introduced you to an easy workflow to start detecting issues in any programming language by just inspecting the source code structure through a parse tree evaluation. Obviously, this educational example has low complexity, but in the same way, more complex detectors or source code analyzers can be built to trigger alarms on more complex bugs.

