Ch 8. Common collections

8.1 Storing Lists of Values with Vectors

Vectors (Vec<T>) are a powerful collection type in Rust that allow you to store multiple values of the same type in a contiguous block of memory. They are dynamic, meaning their size can grow or shrink as needed.

let v: Vec<i32> = Vec::new();
  • This creates an empty vector that can store i32 values.
  • The type annotation (Vec<i32>) is required here because no values are inserted, so Rust has nothing to infer the element type from.
let v = vec![1, 2, 3];
  • The vec! macro creates a vector with the given values.
  • Here, Rust infers the type Vec<i32> from the values provided, so no type annotation is needed.
let mut v = Vec::new();
v.push(5);
v.push(6);
v.push(7);
  • The vector must be mutable (mut) to allow modification.
  • push adds elements to the end of the vector.
let v = vec![1, 2, 3, 4, 5];
let third: &i32 = &v[2]; // Accesses the third element (index 2)
println!("The third element is {third}");
  • Indexing uses square brackets and returns a reference to the element.
  • Caution: If the index is out of bounds, the program will panic and crash.
let third: Option<&i32> = v.get(2); // Safely accesses the third element
match third {
    Some(value) => println!("The third element is {value}"),
    None => println!("There is no third element."),
}
  • get method returns an Option<&T>, which allows you to handle cases where the index might be out of bounds without panicking.
  • If the element exists, Some(&element) is returned; otherwise, None is returned.
let v = vec![1, 2, 3, 4, 5];
let does_not_exist = &v[100]; // Causes panic!
let does_not_exist_safe = v.get(100); // Returns None
  • Using indexing with an invalid index (e.g., 100 in a vector of 5 elements) will cause a panic and crash the program.

  • The get method returns None if the index is invalid, allowing you to handle errors gracefully.
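The difference between the two access styles can be wrapped in a small helper. This is a minimal sketch; `element_or` is a hypothetical name chosen for illustration and is not part of the standard library.

```rust
/// Returns the element at `index`, or `fallback` when the index is out of bounds.
/// `element_or` is a hypothetical helper used only for this sketch.
fn element_or(v: &[i32], index: usize, fallback: i32) -> i32 {
    // `get` yields Option<&i32>; `copied` turns it into Option<i32>,
    // and `unwrap_or` supplies the fallback instead of panicking.
    v.get(index).copied().unwrap_or(fallback)
}
```

Calling `element_or(&v, 100, -1)` on a five-element vector returns the fallback, whereas `v[100]` would panic.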

  • Borrowing and Ownership with Vectors

    • Rust’s borrowing rules ensure safe memory access. You cannot mutate a vector while an immutable reference to one of its elements is still in use.
    let mut v = vec![1, 2, 3, 4, 5];
    let first = &v[0]; // Immutable borrow of the first element
    v.push(6); // Error: cannot borrow `v` as mutable while `first` is still in use
    println!("The first element is: {first}"); // Immutable borrow is used here
    • This code does not compile because adding a new element might reallocate memory, invalidating the reference to the first element.
    • Rust prevents this by enforcing borrowing rules to avoid potential memory safety issues.
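One way to satisfy the borrow checker in a situation like this is to copy the value out before mutating, so that no reference into the vector stays alive. The sketch below assumes the element type is `Copy` (as `i32` is); `push_after_reading` is a hypothetical function name.

```rust
/// Reads the first element, then pushes a new one.
/// Returns the copied first element and the new length, or None for an empty vector.
fn push_after_reading(mut v: Vec<i32>) -> Option<(i32, usize)> {
    // Dereferencing copies the i32, so the borrow of `v` ends on this line.
    let first = *v.first()?;
    v.push(6); // Allowed: no outstanding reference into `v`
    Some((first, v.len()))
}
```

Another option is simply to reorder the code so that the last use of the reference comes before the call to push.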
  • Iterating over values in a vector:

    • Use a for loop to access each element in a vector without needing indices.

    • Example of iterating over immutable references:

      let v = vec![100, 32, 57];
      for i in &v {
          println!("{i}");
      }
    • For mutable references, use a for loop to modify each element:

      let mut v = vec![100, 32, 57];
      for i in &mut v {
          *i += 50;
      }
    • To modify the value of a mutable reference, use the dereference operator (*) to access the value.

  • Safety of iteration:

    • The for loop holds a reference to the whole vector, so the borrow checker prevents modifying the vector itself during iteration.
    • Attempting to insert or remove items inside the loop body results in a compile error.
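When elements need to be removed based on their value, a method that owns the whole traversal avoids the conflict. The following sketch uses `Vec::retain` from the standard library; `drop_evens` is a hypothetical name.

```rust
/// Removes every even number from `v` in place.
fn drop_evens(v: &mut Vec<i32>) {
    // This would NOT compile, because `v` is borrowed by the loop:
    // for x in v.iter() {
    //     if x % 2 == 0 { v.remove(0); }
    // }

    // `retain` keeps only the elements for which the closure returns true.
    v.retain(|x| x % 2 != 0);
}
```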
  • Using enums to store multiple types:

    • Vectors require elements to be of the same type, but enums can represent different types within a vector.

    • Example:

      enum SpreadsheetCell {
          Int(i32),
          Float(f64),
          Text(String),
      }

      let row = vec![
          SpreadsheetCell::Int(3),
          SpreadsheetCell::Text(String::from("blue")),
          SpreadsheetCell::Float(10.12),
      ];
    • Using enums ensures Rust knows the types at compile time and avoids potential errors.
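Because every variant is known at compile time, a match over the elements must handle each possible cell type. A possible sketch of reading such a row back follows; `describe` and `describe_row` are hypothetical helpers, not part of the original example.

```rust
enum SpreadsheetCell {
    Int(i32),
    Float(f64),
    Text(String),
}

/// Renders one cell as text. The match must cover all three variants.
fn describe(cell: &SpreadsheetCell) -> String {
    match cell {
        SpreadsheetCell::Int(n) => format!("int {n}"),
        SpreadsheetCell::Float(x) => format!("float {x}"),
        SpreadsheetCell::Text(s) => format!("text {s}"),
    }
}

/// Renders every cell of a row, preserving order.
fn describe_row(row: &[SpreadsheetCell]) -> Vec<String> {
    row.iter().map(describe).collect()
}
```

If a new variant were added to the enum, the compiler would reject `describe` until the match handled it.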

  • Limitations of using enums:

    • If the types to be stored are not known at compile time, using an enum won’t work.
    • A trait object can be used instead, covered in Chapter 17.
  • Dropping a vector:

    • A vector is automatically dropped when it goes out of scope.

      {
          let v = vec![1, 2, 3, 4];
          // do stuff with v
      } // <- v is dropped here
    • The vector’s elements are also dropped when the vector is dropped.

    • The borrow checker ensures that references to a vector’s contents are used only while the vector itself is still valid.
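The claim that elements are dropped together with the vector can be observed with a type that records its own drop. This is an illustrative sketch only: `Noisy` and `drops_after_scope` are hypothetical names, and the shared counter uses `Rc<Cell<usize>>` purely for demonstration.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Element type that increments a shared counter when it is dropped.
struct Noisy {
    drops: Rc<Cell<usize>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates a vector of `n` elements in an inner scope and reports
/// how many elements had been dropped once that scope ended.
fn drops_after_scope(n: usize) -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let v: Vec<Noisy> = (0..n)
            .map(|_| Noisy { drops: Rc::clone(&drops) })
            .collect();
        assert_eq!(v.len(), n);
        assert_eq!(drops.get(), 0); // Nothing is dropped while `v` is alive
    } // `v` goes out of scope: the vector and every element are dropped
    drops.get()
}
```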

8.2 Storing UTF-8 Encoded Text with Strings

  • Strings in Rust are collections of bytes, with additional methods to interpret them as text.

  • String manipulation in Rust is more complex due to:

    • Rust’s strict handling of potential errors.
    • Strings being a complex data structure, especially with UTF-8 encoding.
  • What is a String in Rust?:

    • Two main types:
      • &str (string slice): Borrowed, immutable references to UTF-8 encoded data.
      • String: Growable, mutable, and owned string type provided by Rust’s standard library.
    • Both String and &str are UTF-8 encoded.
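In practice, functions that only read text usually take `&str`, since a `&String` is automatically coerced to `&str`. A short sketch, with `word_count` as a hypothetical function name:

```rust
/// Counts whitespace-separated words. Accepts string literals and
/// borrowed `String` values alike, because both can be viewed as `&str`.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}
```

Both `word_count("hi there")` and `word_count(&owned)` (where `owned` is a `String`) compile without any conversion call.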
  • Creating a New String:

    • Create an empty string using String::new():

      let mut s = String::new();
    • Create a string with initial content using to_string or String::from:

      let s = "initial contents".to_string();
      let s = String::from("initial contents");
  • UTF-8 Encoded Strings:

    • Strings can store text in any valid UTF-8 format:

      let hello = String::from("Hello");
      let hello = String::from("こんにちは");
      let hello = String::from("Hola");
  • Updating a String:

    • Grow a string using push_str to append a string slice:

      let mut s = String::from("foo");
      s.push_str("bar"); // Results in "foobar"
    • Append a single character using push:

      let mut s = String::from("lo");
      s.push('l'); // Results in "lol"
  • Concatenating Strings:

    • Use the + operator to concatenate strings:

      let s1 = String::from("Hello, ");
      let s2 = String::from("world!");
      let s3 = s1 + &s2; // s3 contains "Hello, world!", but s1 is no longer valid.
    • The + operator takes ownership of the first string (s1), while the second string (s2) is borrowed.

    • Use the format! macro for more complex string concatenation:

      let s = format!("{s1}-{s2}-{s3}"); // Easier to read than using multiple `+` operators.
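The ownership difference between the two approaches shows up in the function signatures one might write around them. In this sketch, `join_with_plus` and `join_with_format` are hypothetical wrappers, and the sample strings are chosen only for illustration.

```rust
/// Concatenates with `+`: takes ownership of `s1` and only borrows `s2`.
fn join_with_plus(s1: String, s2: &str) -> String {
    s1 + s2 // Appends a copy of `s2` to `s1` and returns the result
}

/// Concatenates with `format!`: only borrows, so all inputs stay usable.
fn join_with_format(s1: &str, s2: &str, s3: &str) -> String {
    format!("{s1}-{s2}-{s3}")
}
```

After `join_with_plus(a, &b)`, the caller can still use `b` but not `a`; after `join_with_format(&a, &b, &c)`, all three remain valid.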
  • Indexing into Strings:

    • Rust doesn’t allow direct indexing into strings (e.g., s[0]).

    • Strings are stored as a sequence of bytes, and indexing could lead to invalid access due to variable byte lengths in UTF-8 characters.

    • Instead, use methods like .chars().nth() to access individual characters:

      let s = String::from("hello");
      let h = s.chars().nth(0); // Returns Some('h')
  • Slicing Strings:

    • Indexing into a string with a single number is not allowed, in part because the return type would be ambiguous (byte, character, grapheme cluster, or string slice).

    • To create string slices, use a range (e.g., [0..4]) for a precise slice of bytes:

      let hello = "Здравствуйте";
      let s = &hello[0..4]; // s contains "Зд"
    • Slicing at invalid byte boundaries (e.g., &hello[0..1]) will result in a runtime panic, as Rust ensures slices align with valid UTF-8 characters.

  • Caution with String Slicing:

    • Slicing at improper character boundaries causes runtime panics:

      thread 'main' panicked at byte index 1 is not a char boundary
    • Always ensure your ranges respect character boundaries to avoid crashes.
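When the byte range is not known to be valid in advance, the non-panicking `str::get` method can be used instead of the `[..]` syntax. A minimal sketch, with `prefix_bytes` as a hypothetical helper name:

```rust
/// Returns the first `n` bytes of `s` as a string slice, or None if the
/// range is out of bounds or would split a multi-byte character.
fn prefix_bytes(s: &str, n: usize) -> Option<&str> {
    // `str::get` checks bounds and char boundaries instead of panicking.
    s.get(0..n)
}
```

A boundary can also be tested directly with `str::is_char_boundary` before slicing.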

  • Methods for Iterating Over Strings:

    • Use .chars() to iterate over individual Unicode scalar values (char):

      for c in "Зд".chars() {
          println!("{c}");
      }
      • Output:

        З
        д
    • Use .bytes() to iterate over raw bytes:

      for b in "Зд".bytes() {
          println!("{b}");
      }
      • Output:

        208
        151
        208
        180
    • Grapheme clusters (what users perceive as single characters) are more complex and not supported directly by the standard library. External crates are needed to work with grapheme clusters.
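The gap between the byte view and the char view can be made concrete by comparing lengths and offsets. The helpers below are hypothetical names for illustration; `len`, `chars`, and `char_indices` are standard library methods.

```rust
/// Returns (length in bytes, number of Unicode scalar values).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Returns each char together with the byte offset where it starts.
fn char_offsets(s: &str) -> Vec<(usize, char)> {
    s.char_indices().collect()
}
```

For "Зд", each character occupies two bytes, so the second character starts at byte offset 2 rather than 1.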

  • String Complexity in Rust:

    • Rust exposes more of the complexity of UTF-8 string handling than many other languages, ensuring correct handling from the start.
    • This upfront complexity prevents errors related to non-ASCII characters later in development.
  • Useful String Methods:

    • contains: Search for a substring within a string.

    • replace: Substitute parts of a string with another string.

      let s = "Hello world".replace("world", "Rust");
      println!("{s}"); // Output: Hello Rust
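The two methods combine naturally when a caller needs to know whether a replacement actually happened. This is only a sketch; `replace_if_present` is a hypothetical helper name.

```rust
/// Replaces every occurrence of `from` with `to`, but only when `from`
/// occurs in `text`; returns None when there is nothing to replace.
fn replace_if_present(text: &str, from: &str, to: &str) -> Option<String> {
    if text.contains(from) {
        Some(text.replace(from, to))
    } else {
        None
    }
}
```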

8.3 Hash Maps in Rust

  • The HashMap<K, V> type stores key-value pairs using a hashing function.

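  • Keys can be of any type that implements the Eq and Hash traits. All keys must have the same type, and all values must have the same type. Values are accessed by key instead of by index (unlike vectors).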

  • Use cases include scenarios like tracking scores in a game, where the key is a team name and the value is the score.
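As an illustration of a non-string key, a struct can serve as the key type once it derives the traits the map needs for hashing and comparison. `TeamId` and `sample_scores` are hypothetical names invented for this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical composite key: a team identified by league and number.
/// Deriving Eq and Hash makes the type usable as a HashMap key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct TeamId {
    league: String,
    number: u32,
}

/// Builds a small score table keyed by `TeamId`.
fn sample_scores() -> HashMap<TeamId, i32> {
    let mut scores = HashMap::new();
    scores.insert(TeamId { league: String::from("East"), number: 1 }, 10);
    scores.insert(TeamId { league: String::from("West"), number: 1 }, 50);
    scores
}
```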

  • Creating a Hash Map:

    • Create an empty hash map using HashMap::new() and insert key-value pairs with insert:

      use std::collections::HashMap;

      let mut scores = HashMap::new();
      scores.insert(String::from("Blue"), 10);
      scores.insert(String::from("Yellow"), 50);
  • Accessing Values:

    • Use the get method to retrieve values from the hash map:

      let team_name = String::from("Blue");
      let score = scores.get(&team_name).copied().unwrap_or(0);
      • get returns an Option<&V>. Use copied() to get an Option<V>, and unwrap_or(0) to return 0 if the key isn’t present.
  • Iterating Over a Hash Map:

    • Iterate over key-value pairs using a for loop; the pairs are visited in an arbitrary order:

      for (key, value) in &scores {
          println!("{key}: {value}");
      }
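Since a hash map does not guarantee any particular iteration order, output that must be stable is typically sorted after collecting. A possible sketch, with `sorted_lines` as a hypothetical helper:

```rust
use std::collections::HashMap;

/// Formats every entry as "key: value" and returns the lines sorted,
/// so the result does not depend on the map's internal ordering.
fn sorted_lines(scores: &HashMap<String, i32>) -> Vec<String> {
    let mut lines: Vec<String> = scores
        .iter()
        .map(|(key, value)| format!("{key}: {value}"))
        .collect();
    lines.sort();
    lines
}
```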
  • Ownership and Hash Maps:

    • For types implementing the Copy trait (e.g., i32), values are copied into the hash map.

    • For owned types like String, values are moved into the hash map, and the map takes ownership:

      let field_name = String::from("Favorite color");
      let field_value = String::from("Blue");
      let mut map = HashMap::new();
      map.insert(field_name, field_value);
      // field_name and field_value are invalid after this point
  • Updating Values in a Hash Map:

    1. Overwriting a Value:

      • If the same key is inserted twice, the value is replaced:

        scores.insert(String::from("Blue"), 25); // Overwrites "Blue": 10 with "Blue": 25
    2. Inserting If Key Is Absent:

      • Use entry and or_insert to insert a value only if the key isn’t present:

        scores.entry(String::from("Yellow")).or_insert(50);
    3. Updating Based on Old Value:

      • Use entry to update a value based on the old value, e.g., counting word occurrences:

        let text = "hello world wonderful world";
        let mut map = HashMap::new();
        for word in text.split_whitespace() {
            let count = map.entry(word).or_insert(0);
            *count += 1;
        }
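The three update strategies above can be checked side by side in a self-contained form. The function names `update_scores` and `count_words` are hypothetical; the calls themselves (`insert`, `entry`, `or_insert`) are the ones described in the list.

```rust
use std::collections::HashMap;

/// Applies an overwrite and two conditional inserts to a small score table.
fn update_scores() -> HashMap<String, i32> {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Blue"), 25); // Overwrites 10 with 25
    scores.entry(String::from("Yellow")).or_insert(50); // Inserted: key was absent
    scores.entry(String::from("Blue")).or_insert(50); // No effect: key is present
    scores
}

/// Counts how often each whitespace-separated word occurs in `text`.
fn count_words(text: &str) -> HashMap<&str, u32> {
    let mut map = HashMap::new();
    for word in text.split_whitespace() {
        // `or_insert` returns a mutable reference to the stored count.
        *map.entry(word).or_insert(0) += 1;
    }
    map
}
```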
  • Hashing Functions:

    • By default, HashMap uses a hashing function based on SipHash, which resists denial-of-service (DoS) attacks that target hash tables, at the cost of some speed.
    • For performance-sensitive applications, you can specify a different hasher that implements the BuildHasher trait. Various libraries on crates.io provide alternative hashers.
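To show how a different hasher plugs in, the sketch below implements a toy FNV-1a hasher by hand and wires it into a map through `BuildHasherDefault`. It is for illustration only: `Fnv1a`, `FnvMap`, and `fnv_scores` are hypothetical names, this hasher offers no DoS resistance, and real projects would more likely use a maintained crate.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Toy 64-bit FNV-1a hasher (illustrative, not DoS-resistant).
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap whose third type parameter selects the custom hasher.
type FnvMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Builds a small map that hashes its keys with `Fnv1a`.
fn fnv_scores() -> FnvMap<String, i32> {
    let mut scores: FnvMap<String, i32> = HashMap::default();
    scores.insert(String::from("Blue"), 10);
    scores
}
```

The map is used exactly like a default HashMap; only its construction (`HashMap::default()` with the aliased type, rather than `HashMap::new()`) differs.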