Delta Lake in .NET - Start of a journey
This is the start of my journey to implement Delta Lake in pure .NET. I have created a GitHub repo and reserved the package name DeltaIO, because the others were already taken. So far I have built up a pretty good understanding of the Delta protocol, and I’m now trying to implement reading logic for tables in S3.
I have found these alternative implementations:
- delta-net - an early prototype as of late 2024. The author is probably thinking of giving up, which I think he shouldn’t.
- delta-dotnet - as of the end of 2024, a promising incubator project that is essentially a wrapper around delta-rs. I think this might have a future. I am using delta-rs from Python at the moment, but it doesn’t feel natural, maybe because Rust people are writing Python logic, or maybe because Python is an afterthought. Judging by the readme, the .NET wrapper doesn’t feel very ergonomic to me at the moment either.
What I’d like to create is a pure .NET implementation with a native feel, built on another library I have worked on, Parquet.Net, because Delta relies heavily on Parquet.
What can it do today
I’m proud to say I can read both simple and partitioned tables at the moment, but of course there are bugs and it’s just a start.
Thanks to Stowage, I can read Delta tables from S3, Azure and so on, and this can be extended as well. Stowage is an extremely lightweight and awesome cloud file library for .NET.
There is still no support for things like time travel, i.e. I can read only the latest version.
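For readers new to the protocol, here is a rough sketch of what "reading the latest version" means and why time travel is a small step from it. A Delta table keeps a `_delta_log` directory of JSON commit files, one per version, each line holding a single action; replaying the `add` and `remove` actions in version order gives the set of Parquet files that make up the table. The sketch is in Python for brevity and is not DeltaIO code: the function names, the `commits` dictionary and the sample file paths are all made up for illustration, and it ignores checkpoints and every action other than `add` and `remove`.

```python
import json


def commit_file_name(version):
    # Commit files in _delta_log are named by zero-padded, 20-digit version.
    return f"{version:020d}.json"


def active_files(commits, up_to_version=None):
    """Replay commits in version order.

    `commits` maps a version number to the text of its commit file
    (hypothetical in-memory stand-in for reading _delta_log from storage).
    Returns {data file path: partition values} for files still in the table.
    """
    live = {}
    for version in sorted(commits):
        if up_to_version is not None and version > up_to_version:
            break  # time travel: stop replaying at the requested version
        for line in commits[version].splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                # Partition values live in the log, not in the Parquet file.
                live[add["path"]] = add.get("partitionValues", {})
            elif "remove" in action:
                live.pop(action["remove"]["path"], None)
    return live


# Made-up sample log: two files added, then the first one removed.
commits = {
    0: '{"add": {"path": "year=2023/a.parquet", "partitionValues": {"year": "2023"}}}',
    1: '{"add": {"path": "year=2024/b.parquet", "partitionValues": {"year": "2024"}}}',
    2: '{"remove": {"path": "year=2023/a.parquet"}}',
}

latest = active_files(commits)
as_of_v1 = active_files(commits, up_to_version=1)
```

Here `latest` contains only the 2024 file, while `as_of_v1` still contains both, which is all that time travel amounts to at the log level. A real reader would also need to handle checkpoints, schema metadata and protocol versions.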
So, stay tuned, subscribe to this blog, star the project, donate, it all helps.
To contact me, send an email anytime or leave a comment below.