suggest::util

Function filter_map_chunks

source
pub fn filter_map_chunks<T: Clone>(
    words: &[&str],
    max_chunk_size: usize,
    f: impl Fn(&str, usize, &[T]) -> Result<Option<Vec<T>>, Error>,
) -> Result<Vec<Vec<T>>, Error>
Expand description

Performs a depth-first traversal over all possible chunk sequences in a slice, applies a filter-map function to each chunk in each sequence, and collects the filter-mapped sequences in a Vec.

“Chunks” are non-overlapping subslices of the parent slice as described in slice::chunks().

WARNING: This function potentially does an exponential amount of work! You should always be careful to prune the traversal space by returning None from your mappper function, as described further below, when a chunk does not match what you are searching for.

max_chunk_size controls the maximum chunk size (in number of words), which influences the branching factor at each step in the traversal.

At each traversal step, the filter-map function is called like: f(chunk, chunk_index, chunk_size, path).

chunk is the chunk at that step, chunk_index is its index in the parent words slice, and chunk_size is its size in words. The function can map the chunk to one or more values. Each value expands the branching factor at the current step by max_chunk_size. In other words, the branching factor at a given traversal step is max_chunk_size multiplied by the number of values returned by the filter-map function at that step. path is the path of mapped values that has been travsersed at that step: a sequence of mapped values corresponding to chunks in the parent words slice.

The filter-map function can return None to halt traversal at the current step. Returning None sets the branching factor at that step to zero, pruning the subtree rooted at that step from the traversal space and discarding the path from the output. This is important for keeping traversal reasonably bounded.

Traversal ends and the function returns when all paths have been visited. The returned Vec will contain all traversal paths that weren’t pruned.

§Examples

Mapping chunks in ["a", "b", "c"] to uppercase, up to a max chunk size of 3:

let paths = filter_map_chunks(&["a", "b", "c"], 3, |chunk, _, _| {
    Ok(Some(vec![chunk.to_uppercase()]))
});
assert_eq!(paths.unwrap(), vec![
    vec!["A", "B", "C"],
    vec!["A", "B C"],
    vec!["A B", "C"],
    vec!["A B C"]
]);

Same as previous but using chunk_index in the filter-map function to prune paths that don’t start with "a":

let paths = filter_map_chunks(&["a", "b", "c"], 3, |chunk, chunk_index, _| {
    if chunk_index > 0 || chunk == "a" {
        Ok(Some(vec![chunk.to_uppercase()]))
    } else {
        Ok(None)
    }
});
assert_eq!(paths.unwrap(), vec![
    vec!["A", "B", "C"],
    vec!["A", "B C"],
]);

Same as the first example but using path in the filter-map function to prune paths that include “A B”:

let paths = filter_map_chunks(&["a", "b", "c"], 3, |chunk, _, path| {
    if path.iter().any(|value| value == "A B") {
        Ok(None)
    } else {
        Ok(Some(vec![chunk.to_uppercase()]))
    }
});
assert_eq!(paths.unwrap(), vec![
    vec!["A", "B", "C"],
    vec!["A", "B C"],
    vec!["A B C"],
]);

Mapping each chunk to multiple values:

let paths = filter_map_chunks(&["a", "b", "c"], 3, |chunk, _, _| {
    Ok(Some(vec![format!("{chunk}0"), format!("{chunk}1")]))
});
assert_eq!(paths.unwrap(), vec![
    vec!["a0", "b0", "c0"],
    vec!["a0", "b0", "c1"],
    vec!["a0", "b1", "c0"],
    vec!["a0", "b1", "c1"],
    vec!["a0", "b c0"],
    vec!["a0", "b c1"],
    vec!["a1", "b0", "c0"],
    vec!["a1", "b0", "c1"],
    vec!["a1", "b1", "c0"],
    vec!["a1", "b1", "c1"],
    vec!["a1", "b c0"],
    vec!["a1", "b c1"],
    vec!["a b0", "c0"],
    vec!["a b0", "c1"],
    vec!["a b1", "c0"],
    vec!["a b1", "c1"],
    vec!["a b c0"],
    vec!["a b c1"]
]);