Data Manipulation in Clojure Compared to R and Python (codewithkira.com)

57 points by tosh 2 days ago

13 comments:

by ertucetin 2 hours ago

I’ve built many different kinds of software (backend, frontend, 3D games, CLI tools, a code editor, and more) with Clojure and have been using it for over a decade now.

I can confidently say that, of everything I’ve used it for, data manipulation/transformation is where it’s at its best. Thanks to the author for presenting it clearly and showing how the libraries and code look across different languages, all of which do a great job.

But Clojure has its own special place (maybe in my heart as well :). I think Clojure should be used more in the data science space. Thanks to the JVM, it can be very performant (I’m looking at you, Python).

by olivia-banks 33 minutes ago

Having "NA" treated as nil/null/None by default seems like it would cause the Namibia problem!
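For readers who haven't met the Namibia problem: "NA" is both a common missing-value marker and Namibia's ISO 3166-1 alpha-2 country code. The sketch below reproduces it with only the Python standard library. The sample rows and the `read_rows` helper are made up for illustration and are not taken from the article or from any library.

```python
import csv
import io

# Hypothetical sample data: "NA" here is a real country code, not a missing value.
RAW = "country_code,country_name\nNA,Namibia\nNO,Norway\n,Unknown\n"


def read_rows(text, na_markers):
    """Parse CSV text, mapping any cell whose value is in na_markers to None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (None if value in na_markers else value) for key, value in row.items()}
        for row in reader
    ]


# Treating "NA" as missing silently erases Namibia's country code.
lossy = read_rows(RAW, na_markers={"", "NA"})

# Restricting the markers to the empty string keeps it intact.
safe = read_rows(RAW, na_markers={""})

print(lossy[0])
print(safe[0])
```

In pandas the same choice is exposed through the `na_values` and `keep_default_na` parameters of `read_csv`; whether a given Clojure library offers an equivalent switch is something to check in its own documentation.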

by __mharrison__ 2 hours ago

Good pandas and polars code should also be written in an immutable way...

by epgui 2 hours ago

Good Python code can exist, but Python makes it so easy to write bad code that good Python rarely exists.

by nxpnsv 2 hours ago

Agree. While it is common to see code like these pandas examples, it is very possible to write these manipulations so that they return a new frame or view without changing the inputs.
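A minimal sketch of that style in pandas, assuming a penguins-like frame: the column names, sample values, and helper functions below are invented for illustration and are not the article's code. Each step returns a new DataFrame, so the input is left as it was.

```python
import pandas as pd


def add_bill_ratio(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame with an extra column; df itself is not modified."""
    return df.assign(bill_ratio=df["bill_length_mm"] / df["bill_depth_mm"])


def keep_short_bills(df: pd.DataFrame, max_length: float) -> pd.DataFrame:
    """Return only the rows whose bill length is below max_length."""
    return df.loc[df["bill_length_mm"] < max_length]


# Made-up sample rows, not real measurements.
penguins = pd.DataFrame(
    {
        "species": ["Adelie", "Gentoo", "Chinstrap"],
        "bill_length_mm": [39.1, 46.5, 50.0],
        "bill_depth_mm": [18.7, 15.0, 19.5],
    }
)

# Chain the steps with .pipe; no step assigns into `penguins`.
result = penguins.pipe(add_bill_ratio).pipe(keep_short_bills, max_length=40.0)

print(result)
print(list(penguins.columns))  # the original columns, without bill_ratio
```

The mutating alternative would be something like `penguins["bill_ratio"] = ...`, which changes the frame for every other piece of code holding a reference to it.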

by soumyaskartha 2 hours ago

Clojure never got the data science crowd even though the language is genuinely good for it. Always felt like a distribution problem more than a technical one.

by asa400 2 hours ago

Unfortunately, having to mess around with a JVM is a tough sell for a lot of data analysis folks. I'm not saying it's rational or right, but a lot of people hear "JVM" and they go "no thank you". Personally I think it's a non-issue, but you have to meet people where they are.

by packetlost 2 minutes ago

idk, I don't think I've had to do anything beyond install the JVM to work with Clojure. I'm not really a fan of the clj command's flag choices though (-M, -X, etc. all make no sense).

by cmiles74 8 minutes ago

I dunno, if you can slog through the Python ecosystem then the JVM is starting to look not so bad. Plus with Clojure you don't need to deal with the headache and heartache that is Maven.

by pjmlp 35 minutes ago

The irony, given the mess of Python setup: there are companies whose whole business is to solve Python tooling.

by famicom0 an hour ago

Meanwhile, I find it very annoying to deal with the litany of Python versions, the distinction between global packages and user packages, and the need to manage virtual environments just to run scripts. That being said, I am not an expert, but that's always been my experience when I need to do anything Python-related.

by levocardia an hour ago

In this very post you can see why: the dplyr code is just so much more readable. Like a lot of Python, dplyr reads almost like pseudocode: take this dataset, select the columns that start with "bill", then filter so that bill_length is less than 30. So simple and so little fluff!

by erichocean an hour ago

> is just so much more readable

I thought that too before I learned Clojure; now I find them equally readable.

Data from: Hacker News, provided by Hacker News (unofficial) API