Theory and Design of PL (CS 538)
April 29, 2020
If you want to know more, talk to Mark!
unsafe
Defines allowed, disallowed, and unspecified behaviors.
null pointer
bool that is not true or false
a = f(b) + g(c)
f or g?
There are no restrictions on the behavior of the program.
Compilers are not required to diagnose undefined behavior (although many simple situations are diagnosed),
and the compiled program is not required to do anything meaningful.
ISO C++ forbids mutating string literals (ISO C++ §2.13.4p2)
Dereferencing an invalid pointer is forbidden (ISO C §6.5.3.2p4)
fn pointers
bool that isn’t true or false
char outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
str
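The safe constructors in the standard library refuse to build the invalid values listed above, which is why producing one takes unsafe code. A small sketch of that, assuming the `str` item refers to its UTF-8 requirement; the helper names `is_valid_char` and `is_valid_str` are made up for illustration:

```rust
/// True if `code` is a valid `char`, i.e. lies in
/// [0x0, 0xD7FF] or [0xE000, 0x10FFFF].
fn is_valid_char(code: u32) -> bool {
    char::from_u32(code).is_some()
}

/// True if `bytes` could be viewed as a `str` (valid UTF-8).
fn is_valid_str(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

fn main() {
    assert!(is_valid_char(0xD7FF));
    assert!(!is_valid_char(0xD800)); // surrogate: rejected
    assert!(is_valid_char(0xE000));
    assert!(!is_valid_char(0x110000)); // past the maximum
    assert!(is_valid_str(b"ok"));
    assert!(!is_valid_str(&[0xFF])); // not UTF-8
}
```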
use std::sync::{Arc, Mutex};

// Assumed definition: a `Foo` may hold a shared handle to another `Foo`.
struct Foo(Option<Arc<Mutex<Foo>>>);

fn do_the_foo_thing() {
    let foo1 = Arc::new(Mutex::new(Foo(None)));
    let foo2 = Arc::new(Mutex::new(Foo(None)));
    // Reference cycle
    foo1.lock().unwrap().0 = Some(Arc::clone(&foo2));
    foo2.lock().unwrap().0 = Some(Arc::clone(&foo1));
    // `foo1` and `foo2` are never dropped!
    // Memory never freed. Foo::drop never called. No UB!
}
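One way to observe the leak is to count destructor calls. The sketch below reuses the `Foo` shape from the slide, with an assumed definition and a made-up counter `DROPS` and helper `drops_after_scope`; without the cycle both values are dropped, with the cycle neither is.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Counts how many `Foo` values have been dropped (illustrative only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Assumed definition: a `Foo` may hold a shared handle to another `Foo`.
struct Foo(Option<Arc<Mutex<Foo>>>);

impl Drop for Foo {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Builds two `Foo`s, optionally links them into a cycle, and returns
/// how many were dropped once both handles went out of scope.
fn drops_after_scope(cycle: bool) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    {
        let foo1 = Arc::new(Mutex::new(Foo(None)));
        let foo2 = Arc::new(Mutex::new(Foo(None)));
        foo1.lock().unwrap().0 = Some(Arc::clone(&foo2));
        if cycle {
            foo2.lock().unwrap().0 = Some(Arc::clone(&foo1));
        }
    }
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    assert_eq!(drops_after_scope(false), 2); // chain: both freed
    assert_eq!(drops_after_scope(true), 0); // cycle: leaked, still no UB
}
```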
MAX_INT + 1
In my program (Rust):
/// Read from file `fd` into buffer `buf`.
fn read_file(fd: i32, buf: &mut [u8]) {
    let len = buf.len();
    libc::read(fd, buf.as_mut_ptr(), len);
}
In libc (C):
Compiler error: no unsafe C from safe Rust!
/// Read from the file descriptor into the buffer.
fn read_file(fd: i32, buf: &mut [u8]) {
    let len = buf.len();
    libc::read(fd, buf.as_mut_ptr(), len); // Compile error!
}
Ok, but how do we call C libraries or the OS?
unsafe
Compiler can’t check these: Be careful!
/// Read from the file descriptor into the buffer.
fn read_file(fd: i32, buf: &mut [u8]) {
    let len = buf.len();
    unsafe {
        // `libc::read` takes a `*mut c_void`, so cast the byte pointer.
        libc::read(fd, buf.as_mut_ptr().cast(), len);
    }
}
Rust compiles, but C code may do something bad: Be careful!
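What "be careful" amounts to can be sketched without the `libc` crate. Here `raw_read` is a hypothetical Rust stand-in for a C-style `read` (it trusts the pointer and count it is given), and `read_into` is a made-up safe wrapper that always passes the buffer's own length:

```rust
use std::ptr;

/// Hypothetical stand-in for a C-style `read`: copies up to `count`
/// bytes from `src` to `dst` and returns how many were copied.
///
/// # Safety
/// `dst` must be valid for writes of `count` bytes.
unsafe fn raw_read(src: &[u8], dst: *mut u8, count: usize) -> usize {
    let n = src.len().min(count);
    // SAFETY: the caller guarantees `dst` is valid for `count >= n` bytes.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst, n) };
    n
}

/// Safe wrapper: the count always comes from the buffer itself,
/// so the callee can never be told to write past the end.
fn read_into(src: &[u8], buf: &mut [u8]) -> usize {
    // SAFETY: `buf.as_mut_ptr()` is valid for writes of `buf.len()` bytes.
    unsafe { raw_read(src, buf.as_mut_ptr(), buf.len()) }
}

fn main() {
    let mut buf = [0u8; 4];
    let n = read_into(b"hello", &mut buf);
    assert_eq!(n, 4);
    assert_eq!(&buf, b"hell");
}
```

Passing a count larger than the buffer to `raw_read` would compile just as well; the unsafe block only records that the programmer, not the compiler, checked it.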
What does unsafe mean?
unsafe blocks”
unsafe blocks”
fn main() {
    let mut my_vec = Vec::with_capacity(0); // empty vector
    my_vec.set_len(100);
    my_vec[30] = 0; // UB!
}
Huh?!? UB in safe Rust? How?
unsafe fn
impl<T> Vec<T> {
    /// Sets the length of the vector to `new_len`.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        self.len = new_len;
    }
}
Can only be called in an unsafe block!
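For contrast, a sketch of a use of `set_len` that does uphold its contract: reserve capacity, initialize the elements through a raw pointer, and only then raise the length. The function name `first_n` is made up for illustration.

```rust
/// Builds `[0, 1, ..., n - 1]` by writing into spare capacity
/// and then calling `set_len`.
fn first_n(n: usize) -> Vec<usize> {
    let mut v: Vec<usize> = Vec::with_capacity(n);
    let p = v.as_mut_ptr();
    for i in 0..n {
        // SAFETY: i < n <= capacity, so the slot is inside the allocation.
        unsafe { p.add(i).write(i) };
    }
    // SAFETY: the first `n` elements are initialized and n <= capacity.
    unsafe { v.set_len(n) };
    v
}

fn main() {
    assert_eq!(first_n(4), vec![0, 1, 2, 3]);
    assert!(first_n(0).is_empty());
}
```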
But why is it possible in the first place?
bool is always true or false
With unsafe, breaking program invariants can break language invariants, leading to UB
Language invariant: no accesses to invalid memory
Program invariant: len is no longer than buf
Bad use of Vec::set_len violates program invariant => access memory out of bounds == UB.
Not sufficient to just look in unsafe blocks!
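A toy illustration of why: in the hypothetical `Stack8` type below, the only unsafe block is in `last`, but whether that block is sound depends on every line that can change `len`. The body of `set_len` contains no unsafe operation at all, yet it must be an `unsafe fn` because it can break the invariant that `last` relies on.

```rust
/// Hypothetical fixed-capacity byte stack.
/// Program invariant: `len <= buf.len()`.
pub struct Stack8 {
    buf: [u8; 8],
    len: usize,
}

impl Stack8 {
    pub fn new() -> Self {
        Stack8 { buf: [0; 8], len: 0 }
    }

    /// Pushes a byte; returns false if the stack is full.
    pub fn push(&mut self, b: u8) -> bool {
        if self.len == self.buf.len() {
            return false;
        }
        self.buf[self.len] = b;
        self.len += 1;
        true
    }

    /// Returns the most recently pushed byte, if any.
    pub fn last(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: relies on the invariant `len <= buf.len()`,
        // so `len - 1` is in bounds.
        Some(unsafe { *self.buf.get_unchecked(self.len - 1) })
    }

    /// # Safety
    /// `new_len` must be at most 8.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        self.len = new_len;
    }
}

fn main() {
    let mut s = Stack8::new();
    assert!(s.push(1) && s.push(2));
    assert_eq!(s.last(), Some(2));
    // SAFETY: 1 <= 8.
    unsafe { s.set_len(1) };
    assert_eq!(s.last(), Some(1));
}
```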
unsafe: someone promises to uphold invariants!
“Promise” is called a proof obligation.
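By convention the obligation is written down twice: a `# Safety` section on the unsafe function states what the caller must prove, and a `// SAFETY:` comment at the call site says why it holds. A sketch using the standard `std::str::from_utf8_unchecked`; the wrapper names `as_str_unchecked` and `ascii_as_str` are made up for illustration:

```rust
/// Views `bytes` as a `str` without checking.
///
/// # Safety
/// `bytes` must be valid UTF-8.
unsafe fn as_str_unchecked(bytes: &[u8]) -> &str {
    // SAFETY: the caller promises `bytes` is valid UTF-8.
    unsafe { std::str::from_utf8_unchecked(bytes) }
}

/// Safe wrapper: discharges the obligation with a runtime check.
fn ascii_as_str(bytes: &[u8]) -> Option<&str> {
    if bytes.is_ascii() {
        // SAFETY: ASCII is a subset of UTF-8.
        Some(unsafe { as_str_unchecked(bytes) })
    } else {
        None
    }
}

fn main() {
    assert_eq!(ascii_as_str(b"hi"), Some("hi"));
    assert_eq!(ascii_as_str(&[0xFF]), None);
}
```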
unsafe
unsafe { ... } blocks
unsafe fn
unsafe
unsafe trait and unsafe impl
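The trait forms are the least familiar of these, so here is a small sketch. `Zeroable` is a hypothetical marker trait (not a standard library item): writing `unsafe impl` is the implementor's promise that an all-zero bit pattern is a valid value, and the generic `zeroed` relies on that promise.

```rust
/// Hypothetical marker trait. Implementors promise that a value whose
/// bytes are all zero is a valid value of the type.
unsafe trait Zeroable: Sized {}

// `unsafe impl`: the author of each impl takes on the promise.
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for [u8; 4] {}
// Deliberately not implemented for references: a zeroed `&T` would be null.

/// Safe to call for any `Zeroable` type.
fn zeroed<T: Zeroable>() -> T {
    // SAFETY: `T: Zeroable` promises all-zero bytes are a valid `T`.
    unsafe { std::mem::zeroed() }
}

fn main() {
    assert_eq!(zeroed::<u32>(), 0);
    assert_eq!(zeroed::<[u8; 4]>(), [0, 0, 0, 0]);
}
```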
Idea: Abstraction hides unsafe
unsafe to expose dangerous interfaces
Vec
Using only safe methods of Vec, it is impossible to cause UB, even though Vec uses unsafe internally.
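A quick sketch of what that looks like from the outside: out-of-range requests through the safe API come back as `None` (or panic, in the case of indexing) instead of touching invalid memory. The helper `read_slot` is made up for illustration.

```rust
/// Reads slot `i` through the safe API; `None` if out of bounds.
fn read_slot(v: &[i32], i: usize) -> Option<i32> {
    v.get(i).copied()
}

fn main() {
    let mut v: Vec<i32> = Vec::with_capacity(0); // empty vector
    assert_eq!(read_slot(&v, 30), None); // no out-of-bounds read
    assert_eq!(v.pop(), None); // nothing to pop, no UB
    v.push(7); // grows the buffer as needed
    assert_eq!(read_slot(&v, 0), Some(7));
    // `v[30]` would panic here: a defined failure, not UB.
}
```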
Vec all uphold invariants.
unsafe (e.g. set_len)
use std::fs::File;
use std::io::{BufReader, Read};

fn main() -> std::io::Result<()> {
    // Open: call libc and OS. Safely!
    let file = File::open("foo.txt")?;
    let mut buf_reader = BufReader::new(file);
    let mut contents = String::new();
    // Read: call libc and OS. Safely!
    buf_reader.read_to_string(&mut contents)?;
    assert_eq!(contents, "Hello, world!");
    Ok(())
    // Close: call libc and OS. Safely!
}
File, BufReader are safe abstractions that uphold invariants about files, memory, etc.
#[repr(C)] and #[repr(packed)]
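The effect of the two attributes on layout can be checked with `size_of` and `align_of`. A sketch with made-up struct names; the numbers assume a target where `u32` is 4-byte aligned, and the second struct uses `repr(C, packed)` so the field order is fixed as well.

```rust
use std::mem::{align_of, size_of};

/// C layout: fields in declaration order, with padding after `tag`.
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

/// C field order with all padding removed.
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

fn main() {
    assert_eq!(size_of::<Padded>(), 8); // 1 + 3 padding + 4
    assert_eq!(align_of::<Padded>(), 4);
    assert_eq!(size_of::<Packed>(), 5); // 1 + 4, no padding
    assert_eq!(align_of::<Packed>(), 1);

    let a = Padded { tag: 1, value: 2 };
    let p = Packed { tag: 1, value: 2 };
    // Copy packed fields out by value: references to them may be
    // misaligned, and the compiler rejects taking one.
    let (t, v) = (p.tag, p.value);
    assert_eq!((t, v), (a.tag, a.value));
}
```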
Vec
*const T and *mut T
unsafe to dereference
std::ptr
NonNull
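A short sketch of these pieces together: making a raw pointer is safe, reading or writing through it needs an unsafe block, and `NonNull::new` refuses a null pointer. The function name `bump_through_raw` is made up for illustration.

```rust
use std::ptr::{self, NonNull};

/// Increments `*x` through a raw pointer and returns the new value.
fn bump_through_raw(x: &mut i32) -> i32 {
    let p: *mut i32 = x; // creating a raw pointer is safe
    // SAFETY: `p` comes from a live `&mut i32`, so it is valid and unaliased.
    unsafe {
        *p += 1; // dereferencing requires unsafe
        *p
    }
}

fn main() {
    let mut n = 41;
    assert_eq!(bump_through_raw(&mut n), 42);

    let null: *mut i32 = ptr::null_mut();
    assert!(null.is_null());
    assert!(NonNull::new(null).is_none()); // null is rejected
    assert!(NonNull::new(&mut n as *mut i32).is_some());
}
```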
impl Vec
impl RawVec
impl Vec
pub fn push(&mut self, value: T) {
    // Are we out of space?
    if self.len == self.buf.cap {
        self.buf.double(); // alloc more space
    }
    // put the element in the `Vec`
    unsafe {
        // compute address of end of buffer (`add` takes a `usize` count)
        let end = self.buf.ptr.add(self.len);
        ptr::write(end, value); // write data to raw pointer
        self.len += 1; // increase length
    }
}
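To see the same push logic in something that compiles on its own, here is a minimal sketch built directly on `std::alloc`. `TinyVec` is a made-up type, not the standard library's implementation; it folds the `RawVec` part into one struct and assumes `T` is not zero-sized.

```rust
use std::alloc::{self, Layout};
use std::mem;
use std::ptr::{self, NonNull};

/// Hypothetical minimal vector. Invariants: `len <= cap`, and the
/// first `len` slots of the buffer are initialized.
pub struct TinyVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

impl<T> TinyVec<T> {
    pub fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "zero-sized types not supported");
        TinyVec { ptr: NonNull::dangling(), cap: 0, len: 0 }
    }

    /// Doubles the capacity (or allocates room for one element).
    fn grow(&mut self) {
        let new_cap = if self.cap == 0 { 1 } else { self.cap * 2 };
        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");
        let raw = if self.cap == 0 {
            // SAFETY: the layout has non-zero size (T is not zero-sized).
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            // SAFETY: `ptr` was allocated with `old_layout`; new size is non-zero.
            unsafe { alloc::realloc(self.ptr.as_ptr() as *mut u8, old_layout, new_layout.size()) }
        };
        self.ptr = match NonNull::new(raw as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }

    pub fn push(&mut self, value: T) {
        if self.len == self.cap {
            self.grow();
        }
        // SAFETY: len < cap, so the slot is inside the buffer and unused.
        unsafe { ptr::write(self.ptr.as_ptr().add(self.len), value) };
        self.len += 1;
    }

    pub fn get(&self, i: usize) -> Option<&T> {
        if i < self.len {
            // SAFETY: i < len, so the slot is initialized.
            Some(unsafe { &*self.ptr.as_ptr().add(i) })
        } else {
            None
        }
    }
}

impl<T> Drop for TinyVec<T> {
    fn drop(&mut self) {
        if self.cap != 0 {
            // SAFETY: the first `len` slots are initialized, and the buffer
            // was allocated with this exact layout.
            unsafe {
                ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.ptr.as_ptr(), self.len));
                alloc::dealloc(self.ptr.as_ptr() as *mut u8, Layout::array::<T>(self.cap).unwrap());
            }
        }
    }
}

fn main() {
    let mut v = TinyVec::new();
    for i in 0..10 {
        v.push(i.to_string());
    }
    assert_eq!(v.get(9).map(String::as_str), Some("9"));
    assert!(v.get(10).is_none());
}
```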
impl RawVec
unsafe tools
#[repr(...)]
extern fn
Strings, variadic fns (e.g. printf), extern types
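Two of these tools can be sketched with the standard library alone: an `extern "C" fn` uses the C calling convention, and `CString` / `CStr` convert between Rust strings and NUL-terminated C strings. The function names `add_one` and `c_to_string` are made up for illustration, and no real C code is called here.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

/// Uses the C calling convention, so C code could call it
/// through a function pointer.
extern "C" fn add_one(x: c_int) -> c_int {
    x + 1
}

/// Copies a NUL-terminated C string into an owned Rust `String`.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string.
unsafe fn c_to_string(ptr: *const c_char) -> String {
    // SAFETY: guaranteed by the caller.
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}

fn main() {
    let f: extern "C" fn(c_int) -> c_int = add_one;
    assert_eq!(f(41), 42);

    // Rust strings are not NUL-terminated; `CString` adds the terminator
    // and rejects interior NUL bytes.
    let owned = CString::new("hello").expect("no interior NUL");
    // SAFETY: `owned` is alive and NUL-terminated.
    let back = unsafe { c_to_string(owned.as_ptr()) };
    assert_eq!(back, "hello");
    assert!(CString::new("a\0b").is_err());
}
```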