Supporting a new language

This section helps developers implement support for a new language in rust-code-analysis.

To implement a new language, two steps are required:

  1. Generate the grammar
  2. Add the grammar to rust-code-analysis

A number of metrics are supported; guidance on implementing them for a new language is covered elsewhere in the documentation.

Generating the grammar

As a prerequisite for adding a new grammar, a tree-sitter crate must exist for the desired language, and it must match the tree-sitter version used in this project.

The grammars are generated by a project in this repository called enums. The following steps add the language support from the language crate and generate an enum file that is then used as the grammar in this project to evaluate metrics.

  1. Add the language-specific tree-sitter crate to the enums crate, making sure to tie it to the tree-sitter version used in the rust-code-analysis crate. For example, for the Rust support at time of writing the following line exists in the /enums/Cargo.toml: tree-sitter-rust = "version number".
  2. Append the language to the enums crate in /enums/src/languages.rs. Keeping with Rust as the example, the line would be (Rust, tree_sitter_rust). The first parameter is the name of the Rust enum that will be generated; the second is the tree-sitter function to call to get the language's grammar.
  3. Add a case to the end of the match in the mk_get_language macro rule in /enums/src/macros.rs, e.g. for Rust: Lang::Rust => tree_sitter_rust::language().
  4. Lastly, we execute the /recreate-grammars.sh script that runs the enums crate to generate the grammar for the new language.

At this point we should have a new grammar file for the new language in /src/languages/. See /src/languages/language_rust.rs as an example of the generated enum.
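To make steps 2 and 3 more concrete, the following is a minimal, self-contained sketch of how a list of (Name, crate) pairs can be expanded into an enum and a matching get-language function. It is not the actual code of the enums crate: the macro mk_lang_sketch, the function get_language, the crate tree_sitter_newlang, and the string return values are all hypothetical stand-ins, used so that the example compiles without any tree-sitter dependency.

```rust
// Hypothetical stand-ins for tree-sitter language crates. The real crates
// return a tree-sitter language object; a string is used here as an assumption
// so the sketch has no external dependencies.
mod tree_sitter_rust {
    pub fn language() -> &'static str {
        "rust-grammar"
    }
}

mod tree_sitter_newlang {
    pub fn language() -> &'static str {
        "newlang-grammar"
    }
}

// Illustrative macro (not the project's macro): each `(Name, crate)` pair
// becomes one enum variant and one arm of the `match` in `get_language`.
macro_rules! mk_lang_sketch {
    ( $( ($variant:ident, $krate:ident) ),* $(,)? ) => {
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub enum Lang {
            $( $variant, )*
        }

        pub fn get_language(lang: Lang) -> &'static str {
            match lang {
                $( Lang::$variant => $krate::language(), )*
            }
        }
    };
}

// Adding a language means appending one pair to this list.
mk_lang_sketch!((Rust, tree_sitter_rust), (NewLang, tree_sitter_newlang));

fn main() {
    println!("{:?} -> {}", Lang::Rust, get_language(Lang::Rust));
    println!("{:?} -> {}", Lang::NewLang, get_language(Lang::NewLang));
}
```

Because the match is generated from the same list as the enum, forgetting to register a language in one place but not the other shows up as a compile error rather than a runtime failure.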

Adding the new grammar to rust-code-analysis

  1. Add the language-specific tree-sitter crate to the rust-code-analysis project, making sure to tie it to the tree-sitter version used in this project. For example, for the Rust support at time of writing the following line exists in the Cargo.toml: tree-sitter-rust = "0.19.0".
  2. Next we add the new tree-sitter language namespace to /src/languages/mod.rs, e.g.:

pub mod language_rust;
pub use language_rust::*;
  3. Lastly, we add a definition of the language to the arguments of the mk_langs! macro in /src/langs.rs.

// 1) Name for enum
// 2) Language description
// 3) Display name
// 4) Empty struct name to implement
// 5) Parser name
// 6) tree-sitter function to call to get a Language
// 7) file extensions
// 8) emacs modes
(
    Rust,
    "The `Rust` language",
    "rust",
    RustCode,
    RustParser,
    tree_sitter_rust,
    [rs],
    ["rust"]
)
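To illustrate how a tuple like the one above can drive code generation, here is a small sketch of a macro that consumes a reduced form of the entry (enum name, display name, and file extensions only) and produces a lookup from file extension to language. This is an assumption-laden illustration, not the real mk_langs! macro: the names mk_langs_sketch, display_name, lang_from_extension, and the NewLang entry with its extensions are hypothetical.

```rust
// Illustrative macro (not the project's `mk_langs!`): it accepts a reduced
// tuple of (enum name, display name, [file extensions]) per language.
macro_rules! mk_langs_sketch {
    ( $( ($variant:ident, $display:expr, [ $( $ext:ident ),* ]) ),* $(,)? ) => {
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub enum Lang {
            $( $variant, )*
        }

        impl Lang {
            // Returns the display name given in the tuple.
            pub fn display_name(self) -> &'static str {
                match self {
                    $( Lang::$variant => $display, )*
                }
            }
        }

        // Returns the first language whose extension list contains `wanted`.
        pub fn lang_from_extension(wanted: &str) -> Option<Lang> {
            $(
                if [ $( stringify!($ext) ),* ].contains(&wanted) {
                    return Some(Lang::$variant);
                }
            )*
            None
        }
    };
}

// `NewLang` and its extensions are hypothetical, shown only to demonstrate
// that supporting another language is a matter of adding one more tuple.
mk_langs_sketch!(
    (Rust, "rust", [rs]),
    (NewLang, "newlang", [nl, newlang]),
);

fn main() {
    println!("rs  -> {:?}", lang_from_extension("rs"));
    println!("nl  -> {:?}", lang_from_extension("nl"));
    println!("txt -> {:?}", lang_from_extension("txt"));
    println!("display name: {}", Lang::Rust.display_name());
}
```

Writing the extensions as bare identifiers, as in [rs], keeps each entry short; the sketch turns them into strings with stringify! before comparing.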