Category: Nushell

Been a while since the last update; I’ve been busy with the new job. Things have been going well!

I went to New York in March for my first week at Datadog, which was a good way to start the job; I met other new joiners, met people on my team, and saw a bit of the city. And the view from the office is alright.

Datadog operates at a much larger scale than any company I’ve ever worked at, and that has some upsides and downsides (but mostly upsides). I feel lucky to work with a lot of smart, enthusiastic people.

Last week I was in New York again, for Datadog’s annual DASH conference. I was helping run a booth for my team and it was good to talk to users (and potential users) in person. I got to see a bit more of the city outside of work; the Intrepid Museum was a highlight (an aircraft carrier! a Space Shuttle! a submarine!).

Outside of work, I’ve been spending a little bit of time on Nushell (but not as much as I would like). I’ve been driving some changes to the explore interactive pager (which reminds me, I need to update that documentation). I’m trying to get it to a point where I’m happy with it for version 1.0; I’m not quite there yet but I’ve made a lot of changes under the hood.

Nonfiction I read recently

What Goes Around Comes Around… And Around… (⭐⭐⭐⭐) Michael Stonebraker’s back for another opinionated overview of databases, this time with Andy Pavlo. The whole thing is good but I particularly like their take on vector databases:

They are single-purpose DBMSs with indexes to accelerate nearest-neighbor search. RM DBMSs should soon provide native support for these data structures and search methods using their extendable type system that will render such specialized databases unnecessary

On a related note, I like Simon Willison’s point that maybe you don’t need vectors for RAG:

The more time I spend with this RAG pattern (ed: one using full-text search) the more I like it. It’s considerably easier to reason about than RAG using vector search based on embeddings, and can provide high quality results with a relatively simple implementation.

Fiction I read recently

Lonesome Dove (⭐⭐⭐⭐⭐) This was described to me as “the Western novel to read if you don’t normally read Westerns” and yeah, it was great.

Glorious Exploits (⭐⭐⭐⭐) Funny + touching story about 2 unemployed potters in 412 BC who decide to put on a play with imprisoned Athenian soldiers as the cast.

Microserfs (⭐⭐⭐⭐) Strange that I hadn’t read this before, but Douglas Coupland has a not-entirely-positive reputation in his hometown. He’s kinda known as the guy who got too many undeserved public art commissions around here. Anyway! It was a really fun read and it felt like it could have been written yesterday (surprising for a book about the tech industry written nearly 30 years ago).

A bunch of Horatio Hornblower and Richard Bolitho books (⭐⭐⭐) I was looking for something along the same lines as the excellent Aubrey-Maturin series. These weren’t quite it. Hornblower isn’t very fun as a protagonist and Bolitho is a boring one; I couldn’t make it very far into either series.

Why Nushell?

We can do better than POSIX

I work on a command-line shell called Nushell (Nu for short) with some internet friends, and I think it’s pretty cool. To convince you that it’s cool (or at least worth a try), here’s a whirlwind tour.

Basic Querying

Let’s start by taking a look around the filesystem. We’ll use ls to take a look at the files in /usr/bin:

Recent Nushell/Rust Work

SQLite, file watcher, windows-rs

I’ve joined the Nushell core team. This doesn’t really change what I’m doing day-to-day, but it makes my work on Nu feel a little more official 🙂.

SQLite Support

This is the biggest feature I’ve implemented so far.

I’m pretty proud of how this turned out; it’s very convenient to be able to browse SQLite databases in your shell and interact with them the same way you would any other data source. Nu is often-but-not-always smart enough to avoid unnecessary work when loading things from the database; there’s still some work to do here and it will probably involve rearchitecting how Nushell queries data.

File watcher

I also implemented a watch command that runs arbitrary Nu code in response to file changes. Nothing groundbreaking, but I find myself needing this kind of low-key automation all the time: run tests when code changes, restart a development server, log changes in a directory, etc. I think the ability to respond to file changes should be a more widely available primitive, and now it is.

Rust for Windows

Against all odds, I somehow got sucked back into Windows development. I spent a solid week helping one of Nushell’s dependencies do a big upgrade of their Windows functionality. This required a deep dive into the current state of calling Windows APIs from Rust, and… it’s a mixed bag.

I used the windows crate, which is maintained by Microsoft. It’s an automatically generated set of Rust bindings for Windows APIs, which is both good (very comprehensive, always kept up to date) and bad (some rough edges that might be solved in a handmade solution like winapi). The crate is actively being worked on and it frequently has breaking changes; this means documentation is a little scarce and often out of date. Overall I was impressed and I think the crate has a bright future. But until it settles down a bit, expect some growing pains.

I’ve been using Rust full time for the last month and a bit while contributing to Nushell (more on that later). A lot has changed since I first tried Rust in 2019 and this is my first time working on a big Rust project. Here are some thoughts on the language while they’re still fresh in my head.

Compile times and feedback loops

Rust’s compile times are notoriously slow. Rust development was slow enough on my laptop that I finally gave up on mobile computing and bought a desktop with a top-of-the-line CPU (12900K). Along the way I switched from Windows to Linux (more on that later) and started using the mold linker, and now… things are OK!

I’m able to do incremental builds of Nushell (a huge project) in a second or 2, and a full debug build takes 25s. For smaller projects, incremental builds are nearly instant. There’s certainly room to improve here, and the development experience is not great on average hardware, but… this works for me.

Another thing to consider is that the typical Rust feedback loop is tighter than you might expect from the slow compile times. The Rust compiler catches a lot of bugs before a full build needs to happen, and that reduces the need to do a full build and try things out.

Complexity + monotony

Rust is not a simple language. In total I’ve spent nearly 3 months working mostly in Rust, and the language still has a lot of corners that I don’t have a solid grasp on. To improve on this I’m going to need to branch out from Nushell and write a lot of little tools for myself.

Despite the complexity, I’ve found that writing Rust is sometimes a bit… braindead? The type system is very expressive and the compiler catches a ton of errors, so I spend 25% of my time thinking real hard and 75% painting by numbers to make the compiler happy. I can’t quite decide how I feel about this style of development; it can be a little tedious but it also makes for a better end product.

I (sometimes) want a higher-level Rust

Rust has a lot of great things going for it; the tooling, community, package ecosystem, compiler, and syntax are all fantastic. But the focus on systems programming does mean that Rust isn’t quite as ergonomic as it could be for many use cases.

Sometimes I just want a garbage collector! Sometimes I’d be perfectly happy for Rust to implicitly allocate memory if it makes my code work (for example: converting from a &str to a String)! I don’t know if that will ever be possible in standard Rust, but… maybe there’s room for a Rust variant intended for higher-level use cases.
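
To make the &str-to-String complaint concrete, here’s a minimal sketch. The function names are made up for illustration; the point is just that the allocation always has to be written out by hand.

```rust
/// Hypothetical helper that wants to own its input.
fn greet(name: String) -> String {
    format!("Hello, {name}!")
}

/// Builds a greeting from a borrowed string slice.
fn greet_borrowed(name: &str) -> String {
    // `greet(name)` would be a type error here: Rust never turns a &str
    // into a String implicitly, because that conversion allocates.
    // The allocation has to be spelled out, e.g. with to_string(),
    // to_owned(), or String::from.
    greet(name.to_string())
}
```

Calling greet_borrowed("Nu") works, but only because of the explicit to_string(). One common workaround is to accept impl Into<String> in the signature, which moves the conversion into the callee but doesn’t make the allocation go away.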

On the other hand, the ability to go as low as you want is great. It’s nice to work in a language with a very “high ceiling”; no matter where your Rust project goes, you won’t have to switch to C or C++.

I recently spent a few days tuning Nushell’s GitHub Actions CI pipelines and it paid off: CI used to take about 30 minutes, and now it’s closer to 10. This is not pleasant or glamorous work, but it has a big payoff; every Nu change going forward will spend a lot less time waiting for essential feedback. Here’s how you can do the same.

Use rust-cache

Seriously, it’s really good! GitHub build runners are slow. But GitHub gives every repo 10GB of cache space, and rust-cache takes advantage of that. It caches temporary files for your build dependencies across CI runs, so if you have a lot of dependencies you’ll likely see a big performance boost.

One gotcha to be aware of: GitHub Actions has slightly unintuitive behavior across PRs. PR X is unable to see cache data from PR Y, but they can both see cache data from the base branch (usually main or master). This makes sense from an isolation perspective, but it’s not especially well-documented; I ended up adding an extra CI trigger on main just to fill caches properly.

Split your build and test jobs

Previously we were running cargo build then cargo test in a single job. This was suboptimal for a few reasons:

  1. cargo test often needed to recompile crates that had just been built for cargo build. #[cfg(test)] is the most likely culprit here; it makes sense that build output might be different in “test mode”. This has implications for caching too!
  2. It’s faster to run build and test in parallel; GitHub gives us 20 build runners for free, and we might as well use them.
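
To illustrate the first point, here’s a small sketch (made-up function names, not Nushell code) of how #[cfg(test)] changes what gets compiled. As I understand it, the crate being tested is compiled separately for tests, so its cargo build output can’t simply be reused.

```rust
/// Compiled into every build of the crate.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Reports which configuration the crate was compiled under.
/// cfg!(test) is only true when the crate is built as a test harness.
pub fn build_mode() -> &'static str {
    if cfg!(test) {
        "test"
    } else {
        "non-test"
    }
}

// This module only exists in test builds, so the compiled output of the
// crate really does differ between `cargo build` and `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds() {
        assert_eq!(add(2, 2), 4);
    }
}
```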

Run Clippy after cargo build

Previously we were running Clippy before cargo build. Just switching their order shaved about 5 minutes off every test run! It seems like Clippy can reuse build artifacts from cargo build, but not vice versa.

(Dec 2024: I’ve been told that this doesn’t work anymore. Possible that something’s changed in Cargo/Rust)

Use cargo nextest

cargo nextest is “a next-generation test runner for Rust projects.” It’s dead simple to install in CI, and it’s often faster than cargo test. We didn’t see a huge benefit from this (maybe 30-40s faster?), but that’s because our CI time is dominated by compilation; YMMV depending on your code base and test suite.

Conclusion

If you’d like to see the actual changes, they’re all here. Like anything GitHub Actions, this took a lot of tries to get right; those 5 PRs are just the tip of the iceberg, and there were a lot more experimental changes in my private fork. I’m hopeful that someday we’ll be able to stop programming in YAML files, but we’re not there yet!

Cities & Code
