Ruby 里 set_trace_func 歪曲真相的欺骗性行为set_trace_func smoke and mirrors

This is a translation of set_trace_func, smoke and mirrors which was originally written in English thus the original version is not available on my blog.
本文是对 set_trace_func, smoke and mirrors 的粗略翻译,如有错误请务必指正。

Ruby 解释器提供了可用于追踪代码执行的事件钩子。用 set_trace_func(some_proc) 可以设置钩子,用 set_trace_func(nil) 卸载钩子。这个东西,有时候甚至远比基于单步和断点的调试器有用。

Ruby 会追踪这 8 种事件:

line
文件或行改变
class
模块或类定义开始
end
模块或类定义结束
call
Ruby 方法调用
return
Ruby 方法返回
c-call
C 方法调用
c-return
C 方法返回
raise
异常抛出

并不是所有事件都会被追踪,例如设置和读取变量(局部、全局、实例或者类变量)就不行。

1 2 3 4 5 6 7 8 9
set_trace_func proc { |event, file, line, id, binding, classname|
STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}
@welcome = "Hello"
separator = ", "
$object = "world"
@@excl = "!"
puts(@welcome + separator + $object + @@excl)
example-1.rb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
$ ./example-1.rb
line ./example-1.rb:7 false
line ./example-1.rb:8 false
line ./example-1.rb:9 false
line ./example-1.rb:10 false
line ./example-1.rb:11 false
c-call ./example-1.rb:11 + String
c-return ./example-1.rb:11 + String
c-call ./example-1.rb:11 + String
c-return ./example-1.rb:11 + String
c-call ./example-1.rb:11 + String
c-return ./example-1.rb:11 + String
c-call ./example-1.rb:11 puts Kernel
c-call ./example-1.rb:11 write IO
c-return ./example-1.rb:11 write IO
c-call ./example-1.rb:11 write IO
c-return ./example-1.rb:11 write IO
c-return ./example-1.rb:11 puts Kernel

例子中,String#+ 运行了三次,IO#write 两次——第二次大概是输出换行符的。

Ruby 中很多东西很神奇并没有被转换为方法调用。这种东西的一种是字符串插值(... #{ code } ...),并没有被转换为 String#+ 或是 Array#join,而是直接由解释器特别的处理。

1 2 3 4 5 6
set_trace_func proc { |event, file, line, id, binding, classname|
STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}
@welcome = "Hello"
$object = "world"
puts("#{@welcome}, #{$object}\n")
example-2.rb
1 2 3 4 5 6 7 8
$ ./example-2.rb
line ./example-2.rb:7 false
line ./example-2.rb:8 false
line ./example-2.rb:9 false
c-call ./example-2.rb:9 puts Kernel
c-call ./example-2.rb:9 write IO
c-return ./example-2.rb:9 write IO
c-return ./example-2.rb:9 puts Kernel

同样,创建类并没有被转化为 Class#new,添加方法并没有被转换为 Module#define_method。幸运的是 Ruby 提供了特别的钩子让我们需要捕捉这些事件——当类创建时,其父类的 inherited 方法会被调用;当一个方法被添加时,Ruby 调用其类的 method_added

1 2 3 4 5 6 7 8 9 10 11 12 13
set_trace_func proc { |event, file, line, id, binding, classname|
STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}
class Geolocation
def initialize(lat, long)
@lat = lat
@long = long
end
def to_s
"<#{@lat}, #{@long}>"
end
end
example-3.rb
1 2 3 4 5 6 7 8 9 10 11 12
$ ./example-3.rb
line ./example-3.rb:7 false
c-call ./example-3.rb:7 inherited Class
c-return ./example-3.rb:7 inherited Class
class ./example-3.rb:7 false
line ./example-3.rb:8 false
c-call ./example-3.rb:8 method_added Module
c-return ./example-3.rb:8 method_added Module
line ./example-3.rb:12 false
c-call ./example-3.rb:12 method_added Module
c-return ./example-3.rb:12 method_added Module
end ./example-3.rb:7 false

如果我们想跟踪些有用的东西,不如先换一下输出格式。毕竟我们对跟踪执行过程感兴趣,通常来说文件及行号都没什么用,return/c-return/end 则应该用缩进代替。同时,相比类,我们更注重的是接收者(self)。

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
if event == "line"
# 忽略
elsif %w[return c-return end].include?(event)
$indent -= 2
else
if event == "class"
STDERR.printf "%*s%s %s\n", $indent, "", event, eval("self", binding)
else
STDERR.printf "%*s%s %s.%s\n", $indent, "", event, eval("self", binding), id
end
$indent += 2 if %w[call c-call class].include?(event)
end
}
class Geolocation
def initialize(lat, long)
@lat = lat
@long = long
end
def to_s
"<#{@lat}, #{@long}>"
end
end
a = Geolocation.new(51.12, 17.03)
puts a
p a
example-4.rb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
$ ./example-4.rb > /dev/null
c-call Object.inherited
class Geolocation
c-call Geolocation.method_added
c-call Geolocation.method_added
c-call Geolocation.new
call <, >.initialize
c-call main.puts
call <51.12, 17.03>.to_s
c-call 51.12.to_s
c-call 17.03.to_s
c-call #<IO:0xb7c80fc0>.write
c-call #<IO:0xb7c80fc0>.write
c-call main.p
c-call <51.12, 17.03>.inspect
c-call 17.03.inspect
c-call 17.03.to_s
c-call 51.12.inspect
c-call 51.12.to_s
c-call #<IO:0xb7c80fc0>.write
c-call #<IO:0xb7c80fc0>.write

这里的输出又有点奇怪。在 Geolocation#initialize 开始时, Geolocation#to_s 就被追踪器调用,而不是在其有意义之后。这种情况下,我们一不小心就得到了垃圾 <, >。但是有些对象,在初始化之前被打印是会产生异常的。

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
require "complex"
$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
if event == "line"
# 忽略
elsif %w[return c-return end].include?(event)
$indent -= 2
else
obj = eval("self", binding)
if event == "class"
STDERR.printf "%*s%s %s\n", $indent, "", event, obj
else
obj = "<#{obj.class}##{obj.object_id}>" if id == :initialize
STDERR.printf "%*s%s %s.%s\n", $indent, "", event, obj, id
end
$indent += 2 if %w[call c-call class].include?(event)
end
}
a = Complex.new(11.0, -5.0)
b = Complex.new(2.0, 13.5)
c = a * b
example-5.rb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
$ ./example-5.rb
c-call Complex.new
call <Complex#-605829048>.initialize
c-call 11.0.kind_of?
c-call 11.0.kind_of?
c-call -5.0.kind_of?
c-call -5.0.kind_of?
c-call Complex.new
call <Complex#-605832038>.initialize
c-call 2.0.kind_of?
c-call 2.0.kind_of?
c-call 13.5.kind_of?
c-call 13.5.kind_of?
call 11.0-5.0i.*
c-call 2.0+13.5i.kind_of?
c-call 11.0.*
c-call -5.0.*
c-call 22.0.-
c-call 11.0.*
c-call -5.0.*
c-call 148.5.+
call 11.0-5.0i.Complex
c-call 138.5.==
call 89.5.real
call 138.5.imag
c-call 89.5.-
call 89.5.imag
call 138.5.real
c-call 0.+
c-call Complex.new
call <Complex#-605842188>.initialize
c-call 89.5.kind_of?
c-call 89.5.kind_of?
c-call 138.5.kind_of?
c-call 138.5.kind_of?

很多平常难以发现的错误,自从有了 set_trace_funcgrep 和一些处理日志的单行 Ruby 脚本的帮助,就变得微不足道了……

歪曲真相的欺骗性行为

set_trace_func 不仅仅对调试有用,也适合做各种有趣的事情。Binding.of_caller 是用 set_trace_func 实现的,虽然已经不能用了。magic/help 用了 set_trace_funcirb 中提供方便的帮助:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
irb> help { STDERR.write }
--------------------------------------------------------------- IO#write
ios.write(string) => integer
------------------------------------------------------------------------
Writes the given string to _ios_. The stream must be opened for
writing. If the argument is not a string, it will be converted to a
string using +to_s+. Returns the number of bytes written.
count = $stdout.write( "This is a test\n" )
puts "That was #{count} bytes of data"
_produces:_
This is a test
That was 15 bytes of data

magic/help 通过设置一个 set_trace_func 钩子工作,执行传送过来的块,然后一遇到感兴趣的事件就停止执行。这意味着 help { rm_rf "/" } 完全是安全的。

我开发 magic/help 时遇到的一个难懂的问题是,方法没有按正确的参数个数调用时,追踪结果不一致。

对于 C 写的方法,就算显式声明了参数个数,Ruby 依然在 ArgumentError.new 前产生 c-call。另一方面,对于 Ruby 写的方法,ArgumentError.new 被调用,然后从函数 return 返回,而不产生 call

要描述这个问题,我们来跟踪一下 C 方法 STDERR.write

1 2 3 4 5 6 7 8 9 10 11
c-call #<IO:0xb7c98fa8>.write
c-call ArgumentError.new
c-call #<ArgumentError: ArgumentError>.initialize
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
c-return ArgumentError.new
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
raise #<IO:0xb7c98fa8>.write
c-return #<IO:0xb7c98fa8>.write

与此同时我们跟踪一下 Ruby 方法 FileUtils.remove_entry:

1 2 3 4 5 6 7 8 9 10 11 12
# "call FileUtils.remove_entry" 永远不会产生!
c-call ArgumentError.new
c-call #<ArgumentError: ArgumentError>.initialize
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
c-return ArgumentError.new
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
raise FileUtils.remove_entry
# "return FileUtils.remove_entry" 确实产生了。
return FileUtils.remove_entry

两种情况下,参数都是在调用方法之前被检查的。IO#write 是定义为一个参数的方法:

1
rb_define_method(rb_cIO, "write", io_write, 1);

新版的 magic/help 也能处理这个情况了——如果首事件是 c-callArgumentError.newmagic/help 就等到首个 return 事件(如果有)而不是立刻放弃执行传来的块。快去下载新版,享受神奇的帮助吧。

译者注

与此同时,也推荐晴兰的利用 set_trace_func 方法来判断一个方法是否被 alias 过给您。

Version History版本历史

  • c71529c Add icon in set_trace_fun
  • 7db1ff3 Fix typos in set_trace_func
  • 5b17509 Add See Also to set_trace_fun.
  • 0143959 I18N changes: Add language field to posts
  • 0f1cf5d Add Fist code box
  • e34fb6f Add syntax highlighter
  • 4a8e516 Rename categories: stuff -> data, coding -> tech
  • e1cb087 Fix some formatting & broken links; remove yourls_shorturl, categories and tags
  • c1bc9db Fix some formatting & broken links; remove yourls_shorturl, categories and tags
  • 5b93ba5 Remove author & delete some broken posts
  • 1ee024d Import posts from my WordPress blog