is now out with a new parser and renderer, addressing several existing issues
and ensuring compliance with the TOML 1.0 compliance tests. This was done by
Last March, alexcrichton put out a call for a maintainer for the
toml crate. I had become a maintainer of the
toml_edit crate as part of my work on
cargo add, getting
toml_edit into good enough shape that it became the only TOML parser in
cargo (rust-lang/cargo#10086), as the
cargo team wanted consistent parse behavior. I offered to take over
toml with the goal of migrating it to toml_edit.
toml fills a similar role as
serde_json, providing serde support for the
TOML format and a default data structure to deserialize into.
toml_edit is more complex and slower because it needs to preserve all
end-user formatting from parse to display. As part of the cargo work, we got
toml_edit a lot closer to
toml in performance
and offered the
easy module as a
toml compatibility layer.
In theory, we could just migrate everyone from toml to
toml_edit and be done.
In practice, there is no way to help people through a crate rename. Despite
structopt being absorbed into
clap in 2021, we are still seeing people
use structopt, unaware of the changeover. Additionally, there was interest
in a more stable API than what
toml_edit offers, as we have had a lot of churn due
to the low-level details in the API and as we figure out how best to allow
editing of TOML documents.
So keeping the
toml crate around was worthwhile and we could lighten the
community's overall maintenance burden by combining efforts and code.
toml now passes all compliance tests for TOML 1.0, including
- Not erroring when a table appends to dotted keys
- No more stray commas (,) when writing arrays of tables
- No more ValueAfterTable errors when writing top-level key-value pairs, which previously required users to opt in to a fix
Error information also improved. Most notably, the error messages are changing from the old
invalid type: string "a", expected isize\nin `foo.bar`
to the new
TOML parse error at line 2, column 7
2 | bar = "a"
invalid type: string "a", expected isize
Callers can also render the errors as they wish, like with ariadne. We also improved the quality of the span information being reported.
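To make the new format concrete, here is a minimal sketch of how a caller might turn a byte span into that kind of message. It uses only the standard library. The helpers line_col and render are hypothetical names made up for this example, not part of toml's API, and the span offsets are worked out by hand.

```rust
use std::ops::Range;

/// Hypothetical helper: convert a byte offset into a 1-based (line, column) pair.
/// Assumes `offset` lies on a character boundary within `source`.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    // Every newline before the offset moves us down one line.
    let line = before.matches('\n').count() + 1;
    // The column is counted in characters from the start of the current line.
    let line_start = before.rfind('\n').map(|i| i + 1).unwrap_or(0);
    let column = before[line_start..].chars().count() + 1;
    (line, column)
}

/// Hypothetical renderer that mimics the shape of the new error message.
fn render(source: &str, span: Range<usize>, message: &str) -> String {
    let (line, column) = line_col(source, span.start);
    let text = source.lines().nth(line - 1).unwrap_or("");
    format!("TOML parse error at line {line}, column {column}\n{line} | {text}\n{message}")
}

fn main() {
    let source = "[foo]\nbar = \"a\"\n";
    // Bytes 12..15 cover the `"a"` value on the second line.
    let rendered = render(source, 12..15, "invalid type: string \"a\", expected isize");
    println!("{rendered}");
}
```

A renderer like ariadne does far more (underlines, colors, multiple labels), but it starts from the same ingredients: the source text, a span, and a message.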
toml_edit also helped highlight some issues with
allow you to
Display anything. If it looked like a document, it would be
rendered as such. Otherwise, it would be rendered as a value. This dynamic API
makes it easy to get things wrong. Instead,
TOML value while
Display a TOML document. Similar for
parse. A concrete example of what this allows is for
["a", "b"] as either a document or a value.
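The idea of choosing the rendering by type rather than by inspecting the data can be sketched as follows. Value and Document here are simplified, hypothetical stand-ins written for this example; they are not toml's real types, and the string escaping is only an approximation.

```rust
use std::fmt;

// Hypothetical, simplified stand-ins for illustration; not toml's real types.
enum Value {
    String(String),
    Integer(i64),
    Table(Vec<(String, Value)>),
}

// A document is always a table at the top level.
struct Document(Vec<(String, Value)>);

impl fmt::Display for Value {
    // Always renders as an inline TOML value, even for tables.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Debug formatting is only an approximation of TOML string escaping.
            Value::String(s) => write!(f, "{s:?}"),
            Value::Integer(i) => write!(f, "{i}"),
            Value::Table(pairs) => {
                write!(f, "{{ ")?;
                for (index, (key, value)) in pairs.iter().enumerate() {
                    if index > 0 {
                        write!(f, ", ")?;
                    }
                    write!(f, "{key} = {value}")?;
                }
                write!(f, " }}")
            }
        }
    }
}

impl fmt::Display for Document {
    // Always renders as a document: one `key = value` pair per line.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (key, value) in &self.0 {
            writeln!(f, "{key} = {value}")?;
        }
        Ok(())
    }
}

fn sample() -> Vec<(String, Value)> {
    vec![
        ("name".to_string(), Value::String("demo".to_string())),
        ("version".to_string(), Value::Integer(1)),
    ]
}

fn main() {
    // Same data, two renderings, selected statically by the type.
    println!("{}", Value::Table(sample()));
    print!("{}", Document(sample()));
}
```

Because the caller picks the type up front, there is no guessing based on what the data "looks like", which is where the dynamic approach went wrong.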
Users should also expect maintenance going forward to improve as the code base
is easier to support and not just because of the two-for-one maintenance.
Previously, toml had a handwritten parser that had to deal with the non-linear nature
of TOML. Now,
toml parses everything to an AST and then deserializes to the
end user's data types. Separating the steps of parsing and deserializing
simplifies them, making it easier to confidently make changes. The parser is
also easier to update as it is higher level, using a parser combinator crate.
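As a rough illustration of why the split helps, here is a toy two-step pipeline using only the standard library. It handles just `key = value` lines with integers and basic strings, and every name in it (Node, parse, Package, to_package) is hypothetical. The real parser covers the full TOML grammar and the real second step goes through serde.

```rust
use std::collections::BTreeMap;

// Hypothetical, tiny AST: only integers and basic strings.
#[derive(Debug, PartialEq)]
enum Node {
    Integer(i64),
    String(String),
}

// Step 1: parse text into a tree, knowing nothing about the caller's types.
fn parse(input: &str) -> Result<BTreeMap<String, Node>, String> {
    let mut table = BTreeMap::new();
    for (index, line) in input.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (key, raw) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `key = value`", index + 1))?;
        let raw = raw.trim();
        let quoted = raw.strip_prefix('"').and_then(|rest| rest.strip_suffix('"'));
        let node = match quoted {
            Some(inner) => Node::String(inner.to_string()),
            None => Node::Integer(
                raw.parse::<i64>()
                    .map_err(|_| format!("line {}: invalid value `{raw}`", index + 1))?,
            ),
        };
        table.insert(key.trim().to_string(), node);
    }
    Ok(table)
}

// The end user's data type.
#[derive(Debug, PartialEq)]
struct Package {
    name: String,
    edition: i64,
}

// Step 2: map the tree onto the user's type, knowing nothing about syntax.
fn to_package(table: &BTreeMap<String, Node>) -> Result<Package, String> {
    let name = match table.get("name") {
        Some(Node::String(s)) => s.clone(),
        Some(_) => return Err("invalid type for `name`, expected string".to_string()),
        None => return Err("missing field `name`".to_string()),
    };
    let edition = match table.get("edition") {
        Some(Node::Integer(i)) => *i,
        Some(_) => return Err("invalid type for `edition`, expected integer".to_string()),
        None => return Err("missing field `edition`".to_string()),
    };
    Ok(Package { name, edition })
}

fn main() {
    let table = parse("name = \"toml\"\nedition = 2021\n").expect("valid input");
    let package = to_package(&table).expect("matching types");
    println!("{package:?}");
}
```

Syntax errors can only come from the first function and type errors only from the second, so each can be tested and changed on its own.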
In rough terms, we expect compiles to be slightly slower as more code across
more dependencies is being built. Parse time is also about twice what it
was before (62us to 110us for cargo's
Cargo.toml on my machine). We do have
ideas on how to further improve parse times. We do not track
for TOML, assuming it isn't in a critical path.
As already mentioned,
toml_edit users now have a more stable subset of the API that shares behavior and compile-time.
Otherwise, the biggest gain is span support. We now track the location of each
Item within the original document while parsing. This allows you
to capture that location. Deserialize errors will now look more like parse
errors, showing the error location, and allow you to look up the span.
Unfortunately, span information is only exposed through serde and errors at
this time. To maintain performance,
we only capture spans while parsing rather than Strings for all of the
format-preserving information, avoiding allocations for serde support.
Document::from_str has to replace those spans with strings to allow editing.
We don't keep the spans around for the editing API to keep the size of each
Item in memory smaller. The spans also present a lot of challenges in an
editing API as we can't guarantee what source they are associated with.
Also, a lot of small pieces of polish were found through
toml's tests, e.g.
inconsistent casing and newlines in errors.
We did speed up
toml_edit::de, with parsing cargo's
Cargo.toml going from
115us to 111us. However,
Document::from_str slowed down, going from 85us to
103us. We have hopes to recover the performance loss.
Long term, the
toml_edit::easy API is going to go away in favor of