We have just released the very first version of unstruct, v0.1.3. Unstruct is an open source program that parses simple XML files into text files suitable for bulk inserts into a relational database. It is written in Rust, and the goal is to outperform loading the XML into the database and parsing it there. As an example, on a recent MacBook Pro unstruct can parse 10 000 CDR (call detail record) XML files per second.
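To make the idea concrete, here is a minimal sketch of what flattening simple XML into bulk-insert-friendly text can look like. It is written in Python for brevity and is not unstruct's code, configuration, or output format. The element names (`cdrs`, `cdr`, `caller`, `callee`, `seconds`), the column list, and the tab-delimited layout are all assumptions made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CDR document; the element names are invented for illustration.
SAMPLE_XML = """<cdrs>
  <cdr><caller>555-0100</caller><callee>555-0199</callee><seconds>42</seconds></cdr>
  <cdr><caller>555-0101</caller><callee>555-0142</callee><seconds>7</seconds></cdr>
</cdrs>"""

# Assumed target columns, in the order the database table would expect them.
COLUMNS = ["caller", "callee", "seconds"]


def xml_to_rows(xml_text, record_tag, columns):
    """Return one list of column values per <record_tag> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        # findtext returns the default when the child element is missing.
        rows.append([record.findtext(col, default="") for col in columns])
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = xml_to_rows(SAMPLE_XML, "cdr", COLUMNS)
tsv = rows_to_tsv(rows)
print(tsv, end="")
```

Text in this shape is what most relational databases can load through their bulk-insert facilities, which is the step unstruct is meant to feed.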
The release notes are as follows:
This is the very first release of unstruct. Expect bugs. Read the code in place of missing docs.
The code and binaries can be found on GitHub:
https://github.com/Roenbaeck/unstruct
Feel free to fork and help out! We need help with:
- Testing different XML files (very early stages of development).
- Making it more robust (there’s practically no error handling).
- Improving it in terms of performance and functionality.