FFI exercise: Binding to LevelDB

In this exercise your task is to create a Rust binding, or foreign function interface (FFI) to the LevelDB database library. Typically, and also in this exercise, "foreign" means "C".

You will learn how to:

  • handle pointers passed to or from the foreign language
  • use low-level C bindings
  • utilize Rust's ownership system to provide safety on top ofΒ those raw primitives

Prerequisites

This exercise requires knowledge of a few Rust and C concepts:

Conceptually, LevelDB is a key-value store, AKA a persistent dictionary or map.

Mental exercise: Ownership in C

How is ownership handled in C?

Hint When does a "double free" occur?
Solution Ownership is handled only informally - typically an API's documentation and/or function names (e.g. "create", "new") will indicate whether you are responsible to free up the memory passed to you, or it is somebody else's problem. Unclear ownership (via multiple mutable pointers to the same memory), API misunderstandings or other kinds of human error can easily lead to memory being freed too often or too little, resulting in crashes or leaks.

Setup

Required dependencies

Binding to C is divided into two parts: a minimal low-level interface (called "sys crate") and a higher level wrapper crate.

The sys crate is responsible for linking to the C library and exposing its contents unchanged.

The higher level crate uses the sys crate to provide a more Rust-friendly interface by safely wrapping the inherently unsafe raw parts.

Writing a sys crate yourself is beyond the scope of this exercise. We will be using the leveldb-sys crate provided for you.

Additionally, you'll also need the libc crate which provides C types and other required definitions.

To use them, you will need to specify leveldb-sys and libc in the [dependencies] section of your project's Cargo.toml.

# in Cargo.toml

[dependencies]
+leveldb-sys = "*"
+libc = "*"
❗ Note on specifying dependencies using the asterisk Declaring a dependency as * ("any version") is generally not recommended - we're doing it here to prevent stale version numbers and assume it's safe because these specific crates are very unlikely to introduce breaking changes. An alternative would be to install cargo-edit and then use the contained cargo-add command to add a dependency on the most recent release version:
$ cargo add leveldb-sys
$ cargo add libc

Building leveldb-sys requires CMake and a C++ compiler (gcc, clang, Visual Studio etc.) to be installed on your system.

πŸ”Ž Should you ever need to write your own sys crate you can find instructions for doing so here.

Exercise: Opening and closing a database

Preparation

Conceptually, a LevelDB database is a directory (the "database name") where all its required files are stored. "Opening" it means passing a path (whose last part is the database name) and some options to leveldb_open, notably create_if_missing, which will create the database directory in case it does not exist.

You'll also need these functions and enums from the leveldb-sys crate:

The LevelDB C header documents some conventions used by its implementation.

Your tasks

If you're stuck, check out the Help and hints below!

βœ… Create a new library package project for this group of exercises. Add the required dependencies.

βœ… Implement functions for:

  • opening a database, forwarding the "create if missing" flag
  • closing it again

βœ… Refactor your code and create wrapping structs for the raw leveldb_t and leveldb_options_t types, taking care of the required cleanup operations.

  • It should not be possible to forget the cleanup.
  • The struct member storing the pointer should use an appropriate Rust type to express the fact that it exclusively stores non-null pointers.

βœ… Create a Database struct that manages high-level operations.

Polishing your solution

How's your error handling? open should not panic - return a custom error instead. To keep things concise, make use of the question mark operator (see also the following section on using it in tests). Convert errors where necessary.

βœ… Test the success and error cases.

  • Note: you can provoke an error by trying to open a database folder to which you don't have write access

βœ… What type(s) does your open function accept as database names? What would offer the most flexibility? Get some inspiration from std::fs::File.

βœ… Include the underlying LevelDB error message string in your error type. Mind the ownership!

βœ… Rust Strings are valid UTF-8. What issues might occur

  • as opposed to C strings?
  • regarding valid file system characters? (a LevelDB database is a directory!)

The most straightforward parameter type for the database name is &str (why not String?). Since we're dealing with paths, what would be an alternative that still has the convenience of using string literals on the caller side?

βœ… Change your function signature accordingly.

Hint Which trait bounding provides the required functionality?

Help and hints

  • Getting mysterious crashes? Are you maybe dereferencing an uninitialized raw pointer?
  • c_uchar is an alias for u8, and Rust booleans are safe to cast as u8.
  • You need to pass an error pointer to open. The C library potentially mutates it, and initially it should point to null. For creating a suitable pointer Rust provides you with std::ptr::null_mut.
  • A null pointer is equal (==) to any other null pointer. Raw pointers also offer is_null() for checking.
  • If an error occurs you own the C string containing the error message. You can either free it or reuse it for future calls - which option is more convenient?
  • When open succeeds, it gives you a valid, non-null pointer back. A natural mapping for this type is NonNull::new_unchecked.
  • Owned C strings can be created with std::ffi::CString. Note that unlike C strings, Rust strings can contain null bytes.
  • To handle paths there's std::path::Path. For simplicity reasons, assume paths are valid UTF-8, which in the real world isn't always the case.
  • Errors can be converted with map_err and From::from.
  • String types implement AsRef<Path>, which makes it a good fit for path parameters.

Testing

LevelDB, being a database, persists data to disk. When writing tests for your binding, creating this data in a temporary fashion is appropriate, saving you from doing cleanup work yourself. The tempdir crate provides this functionality, but if you added it to [dependencies] it would also be installed for every user of your library, even if they didn't intend to run your tests. Fortunately, Cargo has a [dev-dependencies] section for crates that are only required during development:

[dev-dependencies]
tempdir = "*"

Basic template

You can use this code skeleton to get started:


#![allow(unused)]
fn main() {
use leveldb_sys::*;
use std::ptr;
use std::ffi::CString;

#[test]
fn basic_template() {
    let options = unsafe { leveldb_options_create() };

    unsafe { leveldb_options_set_create_if_missing(options, true as u8) };

    let mut err = ptr::null_mut();

    let name = CString::new("my_db").unwrap();

    let (db_ptr, err_ptr) = unsafe {
        let db_ptr = leveldb_open(
            options,
            name.as_ptr(),
            &mut err,
        );

        (db_ptr, err)
    };

    unsafe { leveldb_options_destroy(options) };

    if err_ptr == ptr::null_mut() {
        unsafe { leveldb_close(db_ptr) }
    } else {
        unsafe {
            println!("Error opening database: {}", *err_ptr);
        }
    }
}
}