Earlier this year I enrolled in a master's programme[1] at UNIST and joined the Programming Languages and Software Engineering lab (PLaSE) as a student researcher. The stipend covers the school fees and living expenses, and I'm given an academic freedom to choose what to work on and take risks. I will review the life here in detail in another post, but (SPOILER ALERT!) overall I'm quite content with it.
That being said, PLaSE is new and small, we only do research on software engineering and don't do its name justice. Because of that, in the first year here I decided to do each day of Advent of Code[2] in a language I'd never used in competitive programming (CP) before.
Here was my blacklist going in, chronologically: Pascal, Python, Scheme, C, C++, Common Lisp, Lua, Raku, Go, Rust and Zig. I am only proficient in over half of the listed languages, but dura lex, sed lex, I'd already had my CP first time with the rest.
To try any new language, all I have to do is dropping into an ephemeral shell with its implementation using nix-shell
or guix shell
without the fear of bloating up my systems. I'm running NixOS on my laptop with nixpkgs being one of the largest downstream repositories, including everything but the kitchen sink. On the work desktop, I installed Guix System which has a decent set of packages and nix service in case something is missing. Every update, I run garbage collection and get rid of all unnecessary software, i.e. those not declared in my config.
The first day should have been the warm up so I challenged myself with using POSIX utilities. This is a bit irony though as the majority of my time spent outside of the editor or a web browser is inside a (Bourn-again) shell.
The problem was indeed simple, involving only finding the maxima among the sums of newline-separated numbers. I used sed(1p) to turn the input into dc(1) eypressions, and sort(1p) and tail(1p) for picking the largest sum. Probably the most interesting part was that the summation was reusable to grade an assignment for a course I was a teaching assistant for.
The second problem didn't ramp up much in difficulty. It only called for some rather simple arithmetic, and the input format's regularity convinced me to finally give Hare a try.
For just a taste, Hare is boring in a good way. I was excited for the tagged union of error which can include and propagate any debugging information, but unfortunately it wasn't needed for programs of such complexity (nor that errors are ever handled in CP). I'm looking forward to a chance to write more Hare in the future.
The task for day 3 was literally day in scope. I went for another better C that is Nim. My first impression with it wasn't positive: Nim insists on considering each source file as a module and does not allow hyphens in identifier name, so filenames mustn't have any hyphen either. This had led me to piping the source code to nim c -
and executing ~/.cache/nim/stdinfile_d/stdinfile
to keep my solution naming convention. nim r -
wouldn't have worked either since the convention also consists of reading the input from stdin.
On the bright side, uniform function call syntax, identifier case-(and underscore-)insensitivity and optional parentheses allowed me to dodge parentheses in calls and camelCasing altogether. Although I love Lisp and don't have any problem with brackets, I think their placement in ALGOL style hurts the readability of nested calls and camelCase is just objectively bad, pun[3] unintended.
The forth problem wasn't any harder, only requiring simple logic operations and summation. To save time, I opted for Julia, which I was kinda sorta familiar with in building this site (at the time this is published at least). Like Nim, it has higher-order functions and a (reference) compiler capable of producing fast binaries.
The next day's task was finally a breath of fresh air with matrix parsing and LIFO (literal) stacks. It begged for a regular expression parser,[4] hence I mined a tiny bit of Ruby for the task. Ruby had been designed to be an object-oriented Perl, and expectedly it feels very similar to Raku. To an extend, I was also able to avoid ALGOL-style call do quite a Lisp impression.
When I was looking for a second language to learn after the peak of my CP career in middle school, I was choosing between those with garbage-collection that are most popularly used in free software at the time, namely Perl, Python and Ruby. Perl was ruled out due to my fear of sigils and I picked up Python as I didn't want to be a weeaboo. Sometimes I wonder how my side projects would have turned out to be had I chosen differently.
The sixth problem essentially asked for maintaining a finite queue of English letters until it is distinct. The most efficient way to do this is employing bit shifting for the FIFO and a bit set for the letters. I implemented that literally in the Better C subset of D.
Although the language is around my age and influenced the big names like modern C++, Swift and Zig,[5] its documentation is pretty underwhelming and inconsistent. For instance, the 128-bit integer type cent
is documented as a basic data type, however it only exists in the core.int128 library with more cumbersome usage (and doesn't work with dmd -betterC
).
Like with Nim, D compilers also don't allow hyphens in source filenames, so I had to pipe the code to dmd -of=a.out -
(the executable name would be randomized otherwise).
On the first Wednesday of the month of celebration, the problem was parsing cd
and ls
-like invocation and output to reconstruct a directory tree and do, uh, tree stuff. Janet's PEG module was much more delightful for parsing than regular expression on steroids like Raku's grammar.
Writing imperative S-expressions felt dirty, though it's IMHO a quite better take than Lua, understandably as it was originally a redesign of Fennel.
The eighth problem could be efficiently solved via dynamic programming on multidimensional arrays so I used Fortran for array programming. There's not much to say other than that it werkt and, ah yea, dynamic allocation didn't seem worth the effort so I ended up hardcoding the sizes.
The ninth task was about sparse matrix transformation. Naturally I used hash table in Tcl for this purpose and the solution was straightforward enough. I am planning on extending a video game's level configuration to be programmable and the top contenders are now Lua/Fennel, Janet and Tcl. No idea when I'll get to it, but I'mma keep ya posted.
On day 10, I needed to build a less-than-basic[6] calculator. I thought using AWK would spice things up a bit, but it actually simplified the solution. Instead of having to read and parse each operation, the script is executed for each input line, even allowing interleaving matching. Therefore, the behavior specification could be followed closely without any significant effort on adapting the logic for the language.
I used to think of AWK as just a more verbose sed(1). I was wrong and am glad that I was. I guess AWK can come in pretty handy for similar real-world usages, such as log processing or moderately complex transformation of textual data.
Oops!… I did it again.[2] If you thought because I published this right after Christmas it must be a complete advent journal, I have played you for absolute fools! The later problems were increasingly parsing heavy, and while I still had languages I wanted to try, none left was designed for text processing. I was also busy in meatspace at the time thus I couldn't find the time to write byte-level parsers in languages I didn't know.
I didn't try really hard nor got really far, but in the end maybe the real treasure was the experiences I had along the way. I suppose the contact hypothesis might be true, at least in this context; my prejudice against many languages had been cleared away even after surface-level interactions. You should probably also give it a try, who knows, it could be much gayer than you'd expect!
[1] | No, I have not been given any slave. |
[2] | I know, last year I already quit citing how janky later problems were. |
[3] | camelCase was popularized by mainstream object oriented languages. |
[4] | Not really, reading byte-by-byte would also work, just less cool. |
[5] | I feel underachieved now. |
[6] | No eggs were harmed in the making of the solution. |
Follow the anchor in an author's name to reply. Please read the rules before commenting.