How an attempt to solve a clever programming exercise led to a new patch to the R language itself, a review of the enlightening results for the recent data.table community survey, and creating a Doom map in R, because why not?
Episode Links
- This week's curator: Eric Nantz - @theRcast (Twitter) & @[email protected] (Mastodon)
- I Patched R to Solve an Exercism Problem
- {data.table} Community Survey: Results and insights
- Doom plots
- Entire issue available at rweekly.org/2024-W10
Supplement Resources
Supporting the show
- Use the contact page at https://rweekly.fireside.fm/contact to send us your feedback
- R-Weekly Highlights on the Podcastindex.org - You can send a boost into the show directly in the Podcast Index. First, top-up with Alby, and then head over to the R-Weekly Highlights podcast entry on the index.
- A new way to think about value: https://value4value.info
- Get in touch with us on social media
- Eric Nantz: @theRcast (Twitter) and @[email protected] (Mastodon)
- Mike Thomas: @mike_ketchbrook (Twitter) and @[email protected] (Mastodon)
Music credits powered by OCRemix
- Bonus Bop - Donkey Kong Country 2: Serious Monkey Business - Xenon Odyssey, The UArts "Z" Big Band - https://dkc2.ocremix.org/
- Hangarmageddon - Doom Dark Side of the Phobos - EvilHorde - https://ocremix.org/album/4/doom-the-dark-side-of-phobos
[00:00:03]
Eric Nantz:
Hello, friends. We are back with episode 155 of the R Weekly Highlights podcast. This is the weekly show where we showcase the awesome resources that are available every single week in this week's R Weekly issue. My name is Eric Nantz. And as always, I'm delighted that you joined us from wherever you are around the world. And, yes, spring is in the air around here, and I'm feeling happy as always to be joined at the hip by my awesome cohost, Mike Thomas. Mike, how are you doing this morning?
[00:00:30] Mike Thomas:
Doing well, Eric. Yep. Spring is in the air here on the East Coast as well. Trying to start planning some travel and some conferences, and looking forward to getting back and seeing some folks that I will not have seen in a year here, coming up in the next couple of months. Summer feels like it's not that far away.
[00:00:48] Eric Nantz:
That's right. And this is a little bit closer to my sports exploits. We're getting closer to hockey playoff season, and the nerves are starting to happen for my beloved Red Wings to try and squeeze into a wild card slot. It won't be easy, but we're getting the vibes. We're getting the positive vibes here. We'll find out.
[00:01:06] Mike Thomas:
Yes.
[00:01:07] Eric Nantz:
But as always, I could talk with you all day about hockey stuff, but we're gonna talk about R Weekly here and the awesome resources that we mention in this week's current issue. And, let's check the notes here. Oh, yep, that was me curating this week. In between random visits to, like, school libraries and swim meets, I somehow curated this issue, but I think we got a good one to talk about here. And I never am able to do this alone whenever it's my turn, because I have tremendous help, as always, from our fellow R Weekly team members and contributors, with, I believe, 8 or so pull requests to this issue, which is very welcome. Awesome additions, indeed.
And I thank all of you who have been contributing to R Weekly. So without further ado, we're gonna dive right into it, Mike, and I think you're gonna lead us off with a really fun exploration that has a lot of twists and turns that eventually involve patching the R language itself.
[00:02:04] Mike Thomas:
Yes. This is a blog post from Jonathan Carroll titled, I Patched R to Solve an Exercism Problem. I didn't know where Exercism was going. I wasn't sure if we were getting into exorcisms and religion or what sort of road we were going down here, but there is a website called exercism.org that has a lot of different challenges across many different programming languages, which allow you to try out a different programming language each month and solve some sort of nontrivial problem. You know, not just printing hello world in that language, but actually trying to solve a fun little toy exercise. Sort of reminds me of Advent of Code, you know, on this particular website. And Jonathan shows how he has been doing these exercises across a multitude of languages including Haskell, Go, Julia, Python, JavaScript, Scala, Rust, Fortran, and Lua.
Pretty incredible work that he's done. He's gotten quite a few badges on Exercism based upon some of these challenges. And one of the recent challenges was to write an algorithm that converts integers into Roman numerals. And probably in a lot of languages, this is something that's tricky. But for those who may not know, there is a function in base R called as.roman, in the standard utils package, and it allows you to just provide that function with an integer, and it will return what I thought was a string. I guess it's of class roman, which is quite interesting.
But it'll provide you with the Roman numeral equivalent. And that's pretty incredible. So I think Jonathan thought, you know, at this point, he's done. It was almost like a little cheat that he has in the R language to be able to very easily solve this problem, and it's a pretty short algorithm when it's just a single function that's already been implemented. And one of the wild things about R's ability to work with Roman numerals as well is you could assign the output of this as.roman function to a variable, and then you could do that again with a different integer that you're converting to a Roman numeral.
And you can do math with those 2 different objects that are both these Roman numeral objects. You can add them together. You can multiply them. It's pretty incredible. I'm not sure how useful this is on a day to day basis. It's something I've never, I guess, had a use case for, but I'm sure there's folks out there that had a particular use case where it made sense to not only provide Roman numerals to whatever that end output deliverable is, but maybe to even do some math on multiple Roman numeral objects. So, I guess, a pretty cool, the more you know, type thing with base R.
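To make that concrete, here's a small sketch in base R of the as.roman behavior described above, including the class and the arithmetic:

```r
# as.roman() lives in the utils package, which is attached by default
x <- as.roman(2024)
print(x)           # MMXXIV
print(class(x))    # "roman"

# Arithmetic on roman objects works: they are converted to
# integers under the hood, combined, then re-rendered as numerals
a <- as.roman(12)
b <- as.roman(4)
print(a + b)       # XVI
print(a * b)       # XLVIII
```

Under the hood a roman object is just an integer vector with a class attribute, which is why the usual arithmetic group generics apply.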
So, you know, Jonathan realized, unfortunately, as he began to run some tests to try to convert numbers, I believe, 1 through 3,999 to Roman numerals, that one of the tests was failing. And there was a mismatch between what he was expecting and what the as.roman function returned, because the last 100 integers, from 3,900 to 3,999, returned NA values. And this was a little confusing. I guess a lot of other languages sort of state, and this might be in R as well, I believe, that any of their Roman numeral conversion algorithms really go up to 3,999.
That's sort of the final integer value that we have Roman numerals for. So Jonathan was sort of expecting the limit here to be 3,999, not 3,899. So he had to dive into the source code, and this is sort of where it goes from, oh, I have this problem on exercism.org that R already has a nice little base function for, as.roman, I got a one liner, I'm gonna get this new badge, and it quickly cascades, in the spirit of yak shaving, into, oh my goodness, now I have to dive into this, and it looks like I'm gonna need to submit a patch to R itself, and it becomes something much bigger than maybe he initially set out for. So, Eric, do you wanna take it away with the patching of R?
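You can reproduce the failure Jonathan hit in a couple of lines. Hedged: what you see for 3,999 depends on whether your installed R version includes his patch.

```r
# 3899 has always converted fine
print(as.roman(3899))    # MMMDCCCXCIX

# On R versions predating the patch, 3900 through 3999 return NA;
# on patched versions they convert normally
edge <- as.roman(3999)
print(is.na(edge))       # TRUE on older R, FALSE once patched
```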
[00:06:52] Eric Nantz:
Absolutely. And, boy, do I feel seen with the yak shaving analogy here, because I literally have been going through this on an internal package at the day job, where I just wanted to beef up the test suite a little bit. And, boy, now I'm suddenly in the internals of Unix batch processes along the way. So... Network tests. Yeah. Unit tests. Exactly. Luckily, I knew how to patch it, and I knew who was responsible for it, but this is a little different here, because John has indeed discovered that within the source code of R itself that's responsible for these Roman numeral conversions, he did indeed see traces of the number being not 3,999 but 3,899 littered throughout the code base.
Now you may ask, how on earth do you actually search the source code for R itself? Well, we are very thankful as a community that there is, on GitHub, a mirror of the R source code. I believe it's actually under Winston Chang's account still. It's called r-source. You've been here before, Mike. I sure have. This has been in my bookmarks for a very, very long time. And in fact, a mirror like this will often turn up on Google if you're searching for a package's source code on a GitHub repository. Oftentimes, if the package is already on CRAN, the CRAN mirror of that said package will be in, like, the top five results. But, regardless, we're talking about the R source here. So taking advantage of that platform, John did indeed, like I said, search for where this number is actually showing up. And, yeah, it is showing up quite a bit, albeit some of these are what you might call false positives; they're not really having to do with that function itself. But with the typical grep call, he did indeed find a lot of files in the R source tree, in R files, C files, and documentation files.
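You can prototype that kind of source search from R itself. This is just an illustrative sketch, not the exact grep John ran; it uses a throwaway temporary directory standing in for a checked-out source tree:

```r
# Build a tiny stand-in for a source tree
src_dir <- tempfile("r-src-")
dir.create(src_dir)
writeLines(c("# roman numeral helpers",
             "ok <- 0 < x & x < 3900"),
           file.path(src_dir, "roman.R"))
writeLines("print('unrelated')", file.path(src_dir, "other.R"))

# Search every file for the magic number, grep-style
files <- list.files(src_dir, full.names = TRUE, recursive = TRUE)
hits <- Filter(function(f) any(grepl("3900", readLines(f, warn = FALSE))),
               files)
print(basename(hits))   # "roman.R"
```

Pointing `list.files()` at a real clone of r-source and grepping for "3899" would surface the same kind of hit list John worked through.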
So he did have to do a little more intelligent filtering to figure out just where all this is really taking place, and sure enough, he does eventually find it, within format calls and the like. But then he discovers there is a utility type function in a file with the name roman.R. So very straightforward. And there, sure enough, there is a comparison of the range being greater than 0 and less than 3,900. There it is. He has found it. And sure enough, now what's the next step? Right? Well, for R itself, we mentioned there is a mirror of the source code on GitHub, but that's not actually where the upstream code lives for development. It's actually using a Subversion repository.
Shout out to all those who used Subversion in the past. It's been a while for me, but that is where, if you are wanting to learn about contributing to the R project itself, you're gonna have to pull down that SVN mirror to your local machine and then create a patch through SVN. There's an svn diff command for the patch there, I believe; I'm rusty with my Subversion coding here. But when John reached out to the maintainers on the mailing list for R, they did recommend, hey, you know what? It looks like you're on to something. Please submit a patch and file a Bugzilla report. Just like we talk about for contributing to open source in general, finding the best way to reach a project, making sure that issue is tracked, and then getting actionable feedback on it, that's the way to go. Right? So John is following the protocols that have been established by the R project team to submit this report.
And now comes the part where, well, he submitted it. Now you wait. Is it gonna get merged in? Sure enough, it does get merged in. This is exciting stuff here. Right? John has literally patched the R language itself for this issue. Now as you think about, well, will this really work, what's a great way to test if your patch is gonna work? Well, guess what? Here come containers again. John discovered what I've been using for years now, and what the R community has been using for years, which is being able to bootstrap particular R versions with Docker, and in particular the Rocker project, to be able to check if this patch is indeed going to work on the upstream version of R that's coming from the bleeding edge of the Subversion repository.
He was able to pull that down into a container and then verify that his patch actually, indeed, works. So what's next? Well, obviously, when the next point release of R comes out, this patch will be included in it. That'll probably be later this year. But this blog post illustrates such a unique story in terms of the nature of open source, and the fact that one little learning exercise turned into patching the language itself. And John concludes the post with some really great advice if you find yourself in a similar situation in the future, whether it's in an R package or another language entirely.
First, don't always assume that the language itself is broken. Of course, you want to check that you haven't misspecified something. So read the documentation, run some additional tests. That's always helpful. And then, when you do think you've pinpointed something, guess what? Nature of open source: go into the source code itself. And, yes, we have learned that even with the base R source code, there are ways to grep that, both on the GitHub repo and also through Linux utilities like grep and the like. So having a good knowledge of that is extremely helpful for some of these niche bugs like this. And then, don't wait to communicate.
Again, John reached out to the mailing list, put out what he was finding in his explorations, got a response from the maintainer, and was able to get direction on how to proceed next without, you know, going too far without that buy in. And that can happen sometimes. Some people submit patches without checking with a maintainer first, and then there might be a little disagreement, or maybe there's other earlier work that never got merged in. Always communicate early. Nothing bad can happen, in my opinion, from communicating early on this. And then, yeah, if you find an issue like this and you have the capacity, it's excellent if you can ease the burden of the maintainers by writing the patch yourself. Sometimes you might need a little help, and, again, don't hesitate to ask. Maybe a code review, maybe another test case that you'd like someone to assist with.
So I think this post is a terrific story of how to go about this process. And, yeah, don't be afraid of communicating with the R team on these issues, because, guess what? Like anything open source, it's not like they're gonna be able to catch everything themselves. And sure enough, this hard coded limit for Roman numeral conversion went through R for years and years without somebody really discovering it. So better late than never. Right? With open source, you can do your part as a user and as a contributor to get that fixed and benefit everyone else in the process. So, again, if nothing else, also check out John's blog post, because there is some gratuitous, very fun Simpsons imagery too that always warms my retro viewing heart.
[00:14:47] Mike Thomas:
Yes. No, I really appreciated those last points that Jonathan made, those 4 different things that he recommends you consider if you find yourself in the same situation. And it's a pretty empowering thing, right, because we live in an open source world, to be able to contribute and submit a patch to the R language itself, which, you know, thousands, millions of people are going to use and be affected by. That's pretty incredible. And I think Jonathan's put together a pretty nice road map here to help you do that if you find yourself in a similar situation. I think you may need to turn back time to about 2005 to use SVN and a mailing list to do so, but we gotta use the tools that we have. And that's just teasing.
[00:15:42] Eric Nantz:
I would say sometimes it can be intimidating to figure out, okay, just how deep does this rabbit hole go. But sometimes, with a little perseverance, it does indeed pay off. This was a really, really interesting exercise. And you know what? I'm gonna bookmark that Exercism site. That is a really top notch way to hone your programming craft. So nice find there as well. I agree.
[00:16:20] Mike Thomas:
Eric, you know what else is interesting? The results of the 2023 data.table community survey.
[00:16:26] Eric Nantz:
Oh, yes. And this is a good callback to just a few weeks ago, when we were mentioning how the data.table project was revamping some of its governance and making it easier and more transparent for ongoing road map ideas and how users can contribute. So what's the best way to hear how users are receiving your package and what suggestions they have for improvement? That is to release a survey earlier in the year. This blog post is coming from the data.table blog, and, in particular, the author, Alja Sluga. And he starts off with, first of all, thanking everybody that filled out this survey. They got almost 400 responses, which is really nice for a survey like this. And we'll walk through a couple of the key findings here and where they might relate to the data.table project in the future.
The post leads off with a little bit of demographic style information, showing that the majority of users who responded were very experienced, using R for 7 plus years and data.table quite a bit in that time frame as well. And many who responded are using it every day, so there might be a little bit of selection bias going on here. But, hey, it's always good to quantify that information. And then he gets into some of the tangible feedback itself. There were very specific questions, but there was a very obvious trend in terms of what users appreciate the most about data.table, and it's something that actually brought me to some use of data.table in my early days of R programming.
That is performance. It is very memory efficient. If you've been down the road of having that massive CSV or other text file and having base R's read.csv crash your R session because of memory limits, well, data.table has always been very efficient in this space. And when people need speed, they turn to data.table more often than not. And then another positive feature, which ironically has another side to the coin depending on your perspective, is the syntax of data.table itself. I think, Mike, you and I agree that its syntax is very unique as compared to other frameworks in the R language.
But when you invest in that DSL, if you will, you can accomplish a lot in a pretty concise way. As for me, I'm just not a regular data.table user, so I do identify with some of the feedback that we're seeing in this post from those in the community who have to look it up most of the time to figure out how to do certain operations. Again, there is some great documentation out there. It's just, for me, not muscle memory yet how to implement the syntax. So, again, it's good to see tangible data showing these different trends across a spectrum of user bases here.
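For anyone who hasn't invested in that DSL yet, the core idiom is dt[i, j, by]: filter rows in i, compute in j, group with by. A minimal sketch, assuming the data.table package is installed:

```r
library(data.table)

dt <- data.table(
  team   = c("A", "A", "B", "B"),
  points = c(10, 20, 30, 40)
)

# i filters rows, j computes, by groups -- all in one bracket
res <- dt[points > 10, .(total = sum(points)), by = team]
print(res)   # team A: 20, team B: 70
```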
And, overall, it looks like people are pretty satisfied with the package itself. Again, not everything is perfect. Performance comes in as one of the most favorable areas, but then you see some not so great issues as well. In terms of desired functionality, there were some feature requests out there. And, Mike, why don't you take us through some of what the users are hoping for in the future of data.table?
[00:20:03] Mike Thomas:
Yeah. Absolutely. You know, I think one of the most insightful charts for me in this blog post is this importance versus satisfaction plot, which is really interesting. Just to highlight and sort of summarize the feedback from the community: the feature of data.table that was rated with the highest importance and had the highest satisfaction was performance. And then, lower on the importance side but high in satisfaction, was the minimal dependencies, which is absolutely a strength of data.table.
And then higher on the importance but lower in satisfaction, so I think these are things that respondents are hoping data.table may improve, would be the docs and the legibility of the syntax itself. So in terms of that desired functionality they're talking about, one would be support for out of memory processing. I think this is something that has come to light especially with, I believe, the arrow package. Does that do out of memory processing? I believe so. Yes. Okay. So that sort of allows you to operate on the file on disk without bringing it all into memory first. Folks are also looking for richer import and export functionality, with Parquet being the most commonly mentioned item, followed by good old xlsx format.
We can't escape the spreadsheets, can we? Oh my goodness. And then the last piece of desired functionality they have listed here is integration with the pipe operator, which also lined up with some of the questions around how much folks are using the pipe, and I imagine they're talking mostly about the native pipe here. The majority of respondents are responding that the pipe is very useful to them, and they would find some sort of a convenience function for using data.table with the pipe to be very, very helpful.
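In the meantime, data.table chains naturally via repeated brackets, and on recent R the native pipe's underscore placeholder can appear in an extraction call. A hedged sketch, assuming data.table is installed and R is at least 4.3:

```r
library(data.table)

dt <- data.table(x = 1:6, g = rep(c("a", "b"), 3))

# Idiomatic data.table: chain successive [ ] calls
res_chain <- dt[x > 1][, .(m = mean(x)), by = g]

# Native pipe with the underscore placeholder (R >= 4.3)
res_pipe <- dt |> _[x > 1] |> _[, .(m = mean(x)), by = g]

print(identical(res_chain, res_pipe))   # TRUE
```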
And then there is this notion of, I don't know how these things get these names, but this is the name that's been around forever: an alias for the walrus operator, which is just a colon followed by an equals sign. And I guess that sort of lines up with data.table's mascot. Right? It's a walrus? That's right. Yeah. Love the synergy there. Yes. So I think folks were looking for maybe a more plain English alias for that operator, with some of the options being either set, let, or setj.
And set seemed to be the most popular response for the function name that would provide an alias for that walrus operator. And then the final chart, as this blog post starts to wrap up, is on the topic of actually contributing to data.table, gauging folks' interest in contributing to the project, and their contributions in the past. Not surprisingly, spreading the word about data.table and just reporting issues were sort of the top two responses in terms of what folks would be interested in and maybe have the capacity to do, followed by actually contributing to the code base itself. So some users, I guess, in conclusion, are a little worried that the package may be abandoned or stagnating.
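The walrus operator they're polling about is data.table's by-reference assignment. A quick sketch of := plus, hedged, the let() alias that newer data.table releases ship (check your installed version before relying on it):

```r
library(data.table)

dt <- data.table(x = 1:3)

# := adds or modifies a column by reference -- no copy of dt is made
dt[, y := x * 2]
print(dt)

# Newer data.table releases also offer let() as a plain-English
# alias for := (hedged -- verify availability in your version):
# dt[, let(z = x + y)]
```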
One thing that I would wanna say, which I've seen on social media before, is, like, this is the next iteration of language wars. It's now, oh, dplyr or data.table, and you have to be in one camp or the other, and if you're in one camp, you have to not like the people in the other, and vice versa. And I think that's absolutely ridiculous. I hope that doesn't really exist. I would say that, like anything else, it's amazing to have options, and use the tool that fits your use case and your comfort the best. data.table is fantastic if you wanna use it. If you wanna use dplyr and Arrow or, you know, DuckDB, you can use that too. So I think as long as the community continues to rally around the package, initiatives like this one, to try to get feedback and to understand how it can be improved, will go a long way towards the longevity of data.table as well.
And I know that they have done a lot of work on this package around documentation and community just in the last maybe 6 to 12 months. So I'm excited to see these results, and I think the community is strong.
[00:24:58] Eric Nantz:
Yeah. And lots of positive momentum, like you said, this year with some of the steps they're taking. And not that it was lacking in any way before, but as open source projects evolve, you do often have newer contributors or newer users come on board and look at what the available options are for, say, data processing and data manipulation. And it's always great to have choice in this space. I know sometimes in my industry there are some people who get a little, you know, maybe confused about having so many choices in a domain. But you know what? For your specific project, if data.table fits your needs... boy, I remember many days of importing some huge textual biomarker data files, and data.table was as fast as could be in that space. And, yeah, we have lots of great code bases that leverage that package heavily. So I'm always of the mindset, if it ain't broke, don't fix it. And then, also, with respect to data.table maintainership, yeah, it is alive and well. They are really spreading the message out through various channels, and this survey should serve as a reassurance to everybody that they really have the users in mind, both those that have been using data.table for years upon years and those that are coming new to the project, because both are equally important in the lifespan of this space.
And, certainly, I'm really appreciative of the transparency, and I see nothing but great things happening for the project going forward. And the fact that they're sharing this more actively is, I think, a huge step to bringing this, not that it wasn't a first class citizen before, but really putting it into the mind share of most of the R community. I think the data.table project itself is doing great things to make that happen.
[00:26:47] Mike Thomas:
No, I agree as well. Lots of positive momentum, lots to look forward to, and in no way is this project doomed.
[00:27:05] Eric Nantz:
Well, luckily, Mike, we're not doomed in terms of the rest of this episode, because we do have some fun things to talk about here, especially on the visualization side of it. But, of course, you listening, maybe you're wondering, why the heck are we talking about doom and gloom here? Well, we're not referencing that kind of doom. We're referencing something that was a part of my retro gaming heart many, many years ago in my college days: getting together with some friends and playing the heck out of the Doom game by id Software, which was often a trendsetter for all these first person perspective games. Now just what does this have to do with R itself? Well, our last highlight has done this very interesting geometric type exercise on just how Doom maps could be created in the context of R itself, in the aspect of 3 d style visualizations.
Now Mike and I had to do a bit of detective work on this, but we're pretty certain that this blog post has been authored by Ivan Krylov. But we admit we could not find any trace of that on the blog post itself. We did some sleuthing on their GitHub repo. So, hopefully, we're correct. One way or another, we're gonna go with that for now unless we hear otherwise. But Ivan leads off this post by talking about when you would want to visualize a function surface in a 3 d type landscape. So you may be thinking, if you have experience in this space, kinda like a contour map where you see the elevation in a map setting. In fact, it reminded me of a lot of the packages that have been developed, such as rayshader and rayrender and the like, that have been doing a lot of those 3 d visualizations in R itself.
And guess what? Base R itself comes with this built in, especially if you're using extension packages like lattice. There is a way to do contour plots in that, and the rgl package in the R community helps you do 3 d plots in R. But, you know, he thought, we could just do that, but let's make this fun. Let's make a Doom map out of it. Now I've only seen the end product of a Doom map, but just what does that really entail? Well, Mike, we're going to geometry school for a little bit on this one, so buckle up here. But, apparently, in the first and second iterations of Doom, there was no concept of a floor that could slope up a hill or down a hill.
So, apparently, you would have, like, the sky for height, you know, but then you'd have your tiles at a certain level, maybe down at another level in a stepwise fashion. And, of course, R itself, in terms of how you would visualize this, is not gonna be coming with everything out of the box. So there are some open source utilities, called ZDoom and, I believe, Zandronum, which are apparently gonna help with the overall visualization of this after we feed into it from R itself. But here comes the geometry school at play: a Doom map is gonna have a series of points or vertices, lines, sides, and sectors.
And, yes, there are obviously point coordinates for the vertices, an x and a y, and you've got lines connecting them. And then you've got the sides that are visible to the player when they look left or right. And then, also, there will be textures, but that's not really the point of this post. And then there's where the actual height information is represented, and those are called sectors. So Ivan's original idea was to start with the contourLines function and then try to kind of makeshift some artificial slope to get to the heights of this. But, apparently, it didn't quite cut it, where the editor was trying to fix some things that were missed in the translation.
So he kinda had to go back to the drawing board and go with something more universal with respect to Doom maps, and that is literally called the universal Doom map format, also known as textmap, which can store the additional information of the heights of these points, and not just the x and y coordinates on, kinda like, the lower plane, if you will. And then it gets to be really math heavy, or geometry heavy, here because, apparently, you need to be able to split these maps of the height into triangular shapes.
And the blog post has a great illustration of this: splitting a rectangle into 2 triangles of equal area, with the vertices interpolated along the way. And then comes some clever use within base R of the array function, capturing data frames of these x, y, and now z coordinates that capture the height of the contours of these planes. And then a lot more manipulation to start to figure out how to connect all this together. Lots of custom data frames being created here, lots of temp files being created for that mapping utility. And then once he's able to feed these variables into the mapping software, yes, at the end, you have yourself a literal Doom screenshot, after he fed it into that open source utility I mentioned earlier.
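Both ingredients described here are available in base R. A small sketch: contourLines() pulling height contours off the built-in volcano matrix, plus a toy version of the rectangle-to-triangles split (the helper function is our own illustration, not Ivan's actual code):

```r
# Height contours from a matrix -- the starting point of the post
cl <- contourLines(volcano, levels = c(100, 120, 140, 160))
print(length(cl))       # one entry per contour segment
print(names(cl[[1]]))   # "level" "x" "y"

# Toy illustration of splitting one grid cell into two triangles
# that share a diagonal (our own helper, not from the post)
cell_to_triangles <- function(x1, y1, x2, y2) {
  list(lower = data.frame(x = c(x1, x2, x1), y = c(y1, y1, y2)),
       upper = data.frame(x = c(x2, x2, x1), y = c(y1, y2, y2)))
}
tris <- cell_to_triangles(0, 0, 1, 1)
print(length(tris))     # 2
```

Each contourLines() entry carries a level plus the x/y vertex coordinates, which is exactly the kind of raw material you would then have to lift into a height-aware map format.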
I think it was some more manual processing in another utility called SLADE. And sure enough, there is a reproducible R script. If you have that same map emulation software, you do get a shot of the player looking at a contoured hill that looks like it's from the game itself. Like, I wouldn't be able to tell the difference, like some overworld type area. So I admit I have never thought to try anything like this, but guess what? If you wanna try this out, with the right software installed on your system, the R script is downloadable. You can check it out yourself and give it a shot. And, yeah, maybe it's a great way to boost your geometry and mapping skill sets at the same time while having some fun along the way. So, hopefully, Ivan, we're getting your name right here, but thanks for opening our eyes to a use of R that I never thought I'd see happen in my lifetime. But guess what? There's nothing that R can't do. Right, Mike? Absolutely. And it's incredible how much of
[00:33:50] Mike Thomas:
what's generated here is from base r's plotting functions as well, and just, you know, sort of vectors and and things like that. If you download this r script, that's linked at the end of this blog post, it's it's fairly concise, I think, you know, what's necessary. He has these 3 different functions, triangulate as text map, and then the final one, image to doom, that spits out a file that I believe you can pass to this software slate or something like that that'll help generate, this exact image that we're seeing on screen. Fairly concise. It's a really cool, you know, I just I'm really enjoying reading the code here.
I learn something new every day. Today, I learned that there's a function in R called is.unsorted to test if the vector that you pass to it is sorted in ascending order or not. I'm not sure if I have any use cases for it, but I'm certain that probably sometime in the future, I will. The code comments are incredible. He has a beautiful actual diagram in the code comments here, plotting this coordinate map, just literally using comments and characters on your keyboard. That's absolutely fantastic, and it lines up with the diagram that's in the blog post under the triangular sectors section.
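For anyone who hasn't run into it, here's a quick illustration of that function; note that it answers whether the vector is NOT sorted.

```r
# is.unsorted() returns TRUE when a vector is NOT in ascending order.
is.unsorted(c(1, 2, 2, 3))   # FALSE: already sorted (ties are allowed)
is.unsorted(c(3, 1, 2))      # TRUE: out of order

# strictly = TRUE additionally flags ties as "unsorted":
is.unsorted(c(1, 2, 2, 3), strictly = TRUE)  # TRUE
```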
So, you know, really interesting use case. I'd be interested to see how maybe you could take this to the next level with rayshader, and then, you know, maybe make Doom look like it's in a 2023 sort of graphics state. You know, Eric, to be honest, I don't wanna date you, but I'm not familiar with Doom. Halo was probably my first foray into first person shooters, if you will, on the old original Xbox. And even all the way back then, I think the graphics were a little bit of a step up from what we have in Doom. But, you know, I'm sure I would have enjoyed Doom if I had been there.
[00:35:53] Eric Nantz:
Yeah, I dare say I would have. And if that was dating me too much, then I better not mention Wolfenstein, because that even predated Doom; that was id's first entry into the FPS space that kinda changed the world. But, yeah, talk about going a step back in retro graphics; that one's a bit hard on the eyes. But we've actually seen very interesting use cases of games like this, where maybe it's not so much the actual end product that you can get, but they are extendable via mods and things like that. And that's where having code, like, I believe the Doom code's open source now, so, like, you could literally browse this yourself, and hence you see the modding community go to town on things like this. But, yeah, I definitely got the same vibes as you did, Mike, about how you could combine this with some of the awesome work of, like, rayshader and the like to really beef up a fun demonstration that's built entirely with R itself.
But, yeah, I did take a look at the script, like you said, that's available for download. Very well commented. And easily reproducible with the right software. So, again, if you thought R wasn't able to do certain things in terms of visualization that combine with retro gaming, well, this post has definitely solved that for you. Yes, I'll have to check out Ivan's previous posts, because he's definitely got a great selection of additional topics with respect to, you know, integrations with C and, looks like, others on contributing to R itself. Lots of great nuggets here, and I'll definitely keep this bookmarked.
[00:37:33] Mike Thomas:
You know what I always say? R is the 2nd best language for doing just about anything. Couldn't have said it better myself.
[00:37:40] Eric Nantz:
And maybe the 2nd best resource for everything in R might be R Weekly itself, because we have a mix of everything as well, from the highlights we talked about today to awesome, interesting use cases via blog posts, tutorials, new packages and updated packages, and the like. So we'll take a couple minutes to talk about our additional finds here. And for me, this isn't so much R specific, but we alluded to it earlier, Mike, that it is conference season. It's starting to get underway with various conferences out there. And maybe you are like me, especially in my earlier days when I would go to these meetups for the first time; I'm a bit of a shy dude, I must say. So, you know, what's the best way to kinda feel comfortable and find ways of connecting with others? Well, my additional find here is from the Jumping Rivers blog, authored by Rhian Davies and Keith Newman, called An Introvert's Guide to Networking at a Conference.
So this is a very nice way to kinda ease that little fear or apprehension you might have at the beginning of these events: how you might navigate certain situations, how to keep contact with people that you do end up networking with, you know, what are some ideas for icebreakers and whatnot. And not to feel too much pressure if you're being sent on behalf of, say, the organization that you're a part of, but really trying to soak in that experience in an optimal way. So, yeah, I definitely resonate with a lot of these points here. And, also, I'll mention a heads up that we often hear at the various Posit conferences: the idea, when you're in a group setting, of the Pac-Man rule, having, like, an open slot so that people can join your group and join in on the discussion.
Things like this with practice really do add up and help make you feel a lot more comfortable. So really great post on the Jumping Rivers blog, and, yeah, posit::conf will hopefully be my next in person event, and I'll be taking this to heart like always.
[00:39:44] Mike Thomas:
I like that one a lot. Another one that I found was from El Saman on the key advantages of using the {keyring} package. And the keyring package allows you to essentially store secrets that are retrievable, with environment variables being, I think, the most common way to do that. And, you know, one of the differences between using keyring and maybe using a .Renviron file that would be, like, project specific, is that with keyring, you can store that particular secret once and for all per computer that it's on, which is nice. You know, you don't necessarily have to do that on a project to project basis.
You also do not have to worry about somebody accidentally forgetting to gitignore that .Renviron file and it making its way up to GitHub or GitLab or whatever hosting service you use for your git repository. So that's a nice feature as well that you may be interested in leveraging, as opposed to doing it the old hard coded way with, you know, Sys.getenv and setting environment variables that way. So it might be interesting for some folks who are looking to brush up on their best practices around security and environment variables and passwords and secrets and all that stuff.
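To make that contrast concrete, here's a minimal sketch; the variable and service names are purely hypothetical, and the keyring calls are guarded since available credential-store backends vary by operating system.

```r
# Hypothetical secret and names, purely for illustration.

# The .Renviron-style approach: the value lives in an environment
# variable (normally set in a .Renviron file) and is read back with
# Sys.getenv().
Sys.setenv(MY_API_KEY = "not-a-real-secret")
Sys.getenv("MY_API_KEY")

# The keyring approach stores the secret once per machine in the OS
# credential store instead. Guarded here because which backend is
# available depends on the system (macOS Keychain, Windows Credential
# Store, Secret Service on Linux, etc.).
if (requireNamespace("keyring", quietly = TRUE)) {
  try({
    keyring::key_set_with_value("my_service", password = "not-a-real-secret")
    keyring::key_get("my_service")
  }, silent = TRUE)
}
```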
[00:41:04] Eric Nantz:
Yeah. This is terrific when you're using R in, like, a traditional client kind of setting, where you may have a team using the RStudio IDE or whatnot on your local machine. The keyring package is gonna be instrumental in helping, like you said, keep some of those credentials secure, not nag you for them all the time, and minimize the potential for leakage. Unfortunately, I don't think this would be compatible with, like, a Shiny app that's deployed on a server somewhere, but I will have to look into this a bit more, because I know the keyring itself is used every single day. When I go on to my Linux system here at home, I often get prompted once for my administrative password to do a certain task, but that's being stored in the keyring credential store and not anywhere else. So lots of ways that I'm sure this could be used that I'm probably not even aware of. So great find as always.
[00:41:58] Mike Thomas:
And then we have one more that I think we would be remiss not to mention, at least to give a quick shout out to Bruno Rodrigues. We are at part 10 of Reproducible Data Science with Nix, and the discussion here is on contributing to Nix packages. So if you have been following along with Bruno's saga, and crusade on getting folks to check out Nix for doing reproducible data science and having that fully reproducible environment that you can come back to, you know, years from now and run your code, and it'll still output the same thing: check out part 10. It's the latest in the series, and it will not disappoint.
[00:42:38] Eric Nantz:
Yeah, it dovetails so nicely with what we talked about earlier, the idea of patching such an influential project as the R language itself. But guess what? Yeah, Nix, the momentum keeps coming. And I was even doing a little poking around, unrelated to Nix itself, while continuing my efforts with this Shiny application as a WebAssembly bundle for my R Consortium work. I was poking around the WebR repo that George Stagg has been working on, and I see a commit, or I should say it was Shinylive, Shinylive for R, I see a commit that they are making things compatible with Nix packaging. So the plot thickens. It seems like more traction's happening with respect to the big players in the R community itself with Nix. So, yep, Bruno, I'm really excited; not sure if your series is ever gonna end, but I'll be bookmarking it one way or another.
[00:43:34] Mike Thomas:
I hope it doesn't. That's awesome.
[00:43:36] Eric Nantz:
Yeah. There's much more than just that in this week's issue. Again, tremendous fun curating this for all of you. And thanks to Jonathan Carroll again for his awesome utility, we call it the Curinator, to help drive some of these feeds for us in a more systematic way with GitHub Actions. So thanks, John, for making that for our curator team here. But, of course, R Weekly does not live without all of you in the community and your contributions. We invite you: if you see a great blog post, a great new package, or a great new tutorial, and you want the R Weekly audience to see it, well, we're a pull request away. Talking about contributing, right? You won't have to dive into any internals of R itself to do this. You just have to go to rweekly.org.
There's a little handy link to the draft right at the upper right corner. You can just submit a pull request with your markdown link all formatted for you and all set to go. That's a great way to contribute to the project. And as always, we are looking for curators as well. If you wanna sign up for that or get to know the process around it, we also have links directly at the top of each issue. Plenty of ways you can get involved with R Weekly. And then, also, we love hearing from you in the community. We've got the handy contact page in the show notes of this episode, and with a modern podcast app like Podverse or Fountain, you can send us a little boost directly in your app to share a little fun along the way with all of you. And then, also, we are sporadically on these social medias.
I'm more often on Mastodon these days, where I'm @rpodcast at podcastindex.social. Sporadically on the weapon X thing, I've got @theRcast. And I'm on LinkedIn from time to time, cross posting the episodes and chiming in from time to time with some fun R projects.
Mike, where can the listeners get a hold of you?
[00:45:30] Mike Thomas:
Sure. You can find me on LinkedIn if you search Ketchbrook Analytics, k e t c h b r o o k. You can find out what I'm up to. Or occasionally on Mastodon as well at [email protected]. And, I guess, a little episode cleanup quickly around 2 things that I had mentioned. I think I did shout out at some point in the podcast, "don't write unit tests." That was, of course, satire and a joke; please do write unit tests. And secondly, I think I may have said that R is the 2nd best language for doing just about anything. Obviously, it's the first best language for doing just about anything. So a little cleanup there.
[00:46:11] Eric Nantz:
I think it's implied, but, you know, it never hurts, right? And, yeah, we expect transparency on this show, so we fully appreciate that, Mike, as always. And, yeah, I'm about to probably go through some more yak shaving, if you will, on an internal project. But just as I think I'm at the finish line, I'm probably gonna find something else to bide my time with. But, yep, thank you as always to all of you around the world for listening, and we will be back with another edition of R Weekly Highlights next week.
[00:00:48] Eric Nantz:
That's right. And this is a little bit closer to my sports exploits: we're getting closer to hockey playoff season, and the nerves are starting to happen for my beloved Red Wings to try and squeeze into a wild card slot. It won't be easy, but we're getting the vibes, we're getting the positive vibes here. We'll find out.
[00:01:06] Mike Thomas:
Yes.
[00:01:07] Eric Nantz:
But as always, much as I'd love to talk about hockey stuff, we're gonna talk about R Weekly here and the awesome resources that we mention in this week's current issue. And let's check the notes here. Oh, yep, that was me curating this week. In between random visits to, like, school libraries and swim meets, I somehow curated this issue, but I think we got a good one to talk about here. And I'm never able to do this alone whenever it's my turn, because I have tremendous help, as always, from our fellow R Weekly team members and contributors, with your, I believe, 8 or so pull requests to this issue, which is very welcome. Awesome additions, indeed.
And I thank all of you that have been contributing to R Weekly. So without further ado, we're gonna dive right into it, Mike, and I think you're gonna lead us off with a really fun exploration that has a lot of twists and turns that eventually involve patching the R language itself.
[00:02:04] Mike Thomas:
Yes. This is a blog post from Jonathan Carroll titled, I Patched R to Solve an Exercism Problem. I didn't know where "Exercism" was going; I wasn't sure if we were getting into religion or what sort of road we were going down here. But there is a website called exercism.org that has a lot of different challenges across many different programming languages that allow you to try out a different programming language each month and solve some sort of non-trivial problems. You know, not just printing hello world in that language, but actually trying to solve a fun little toy exercise. Sort of reminds me of Advent of Code, you know, on this particular website. And Jonathan shows how he has been doing these exercises across a multitude of languages, including Haskell, Go, Julia, Python, JavaScript, Scala, Rust, Fortran, Lua.
Pretty incredible work that he's done. He's gotten quite a few badges on Exercism based upon some of these challenges. And one of the recent challenges was to write an algorithm that converts integers into Roman numerals. And probably in a lot of languages, this is something that's tricky. But for those who may not know, there is a function called as.roman, in the utils package that ships with R, and it allows you to just provide that function with an integer, and it will return what I thought was a string. I guess it's of class roman, which is quite interesting.
But it'll provide you with the Roman numeral equivalent. And that's pretty incredible. So I think Jonathan thought, you know, at this point, he's done. It was almost like a little cheat that he has in the R language to be able to very easily solve this problem, and it's a pretty short algorithm when it's just a single function that's already been implemented. And one of the wild things about R's ability to work with Roman numerals as well is you could assign, you know, the output of this as.roman function to a variable, and then you could do that again with a different integer that you're converting to a Roman numeral.
And you can do math with those 2 different objects that are both these Roman numeral objects. You can add them together. You can multiply them. It's pretty incredible. I'm not sure how useful this is on a day to day basis; it's something I've never, I guess, had a use case for. But I'm sure there's folks out there, you know, that had a particular use case where it made sense to not only provide Roman numerals in whatever that end output deliverable is, but maybe to even do some math on multiple Roman numeral objects. So I guess a pretty cool, the more you know, type thing with base R.
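To see what that looks like in practice, here's a quick sketch with as.roman (it lives in the utils package that ships with every R installation):

```r
# as.roman() returns an object of class "roman", not a plain string.
x <- as.roman(2024)
class(x)            # "roman"
as.character(x)     # "MMXXIV"

# Arithmetic works too: the values are combined numerically and
# displayed again as Roman numerals.
as.roman(4) + as.roman(5)    # IX
as.roman(2) * as.roman(3)    # VI
```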
So, you know, Jonathan realized, unfortunately, as he began to run some tests to convert the numbers, I believe, 1 through 3,999 to Roman numerals, that one of the tests was failing. And there was a mismatch between what he was expecting and what the as.roman function returned, because the last 100 integers, those above 3,899, returned NA values. And this was a little confusing. I guess a lot of other languages sort of state, and this might be in R as well, I believe, that any of their Roman numeral conversion algorithms really go up to 3,999.
That's sort of the final integer value that we have Roman numerals for. So Jonathan was expecting the limit here to be 3,999, not 3,899. So he had to dive into the source code. And this is sort of where it goes from, you know, "oh, I have this problem on exercism.org that R already has a nice little base function for, as.roman, I've got a one-liner, I'm gonna get this new badge," and it quickly cascades, in the spirit of yak shaving, into "oh my goodness, now I have to dive into this, and it looks like I'm gonna need to submit a patch to R itself," and becomes something much bigger than maybe he initially set out for. So, Eric, do you wanna take it away with the patching of R?
[00:06:52] Eric Nantz:
Absolutely. And, boy, do I feel seen with the yak shaving analogy here, because I literally have been going through this on an internal package at the day job, where I just wanted to beef up the test suite a little bit. And, boy, now I'm deep in the internals of Unix batch processes along the way. Network tests. Yeah. Unit tests. Exactly. Luckily, I knew how to patch it, and I knew who was responsible for it. But this is a little different here, because John has indeed discovered, within the source code of R itself that's responsible for these Roman numeral conversions, traces of the number being not 3,999 but 3,899, littered throughout the code base.
Now you may ask, how on earth do you actually search the source code for R itself? Well, we are very thankful as a community that there is, on GitHub, a mirror of the R source code. I believe it's actually under Winston Chang's account still; it's called r-source. You've been here before, Mike. I sure have. This is gonna be in the bookmarks for a very, very long time. And in fact, a GitHub mirror will often turn up on Google if you're searching for a package's source code; if the package is already on CRAN, the CRAN mirror of said package will be in, like, the top five results. But, regardless, we're talking about the R source here. So taking advantage of that platform, John did indeed, like I said, search for where this number actually shows up. And, yeah, it shows up quite a bit, albeit some of these are what you might call false positives; they're not really having to do with that function itself. But with a typical grep call, he did indeed find a lot of matches in the R source tree, in .R files, C files, and documentation files.
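If you're curious what that kind of search looks like, here's a hedged sketch. The real command would point at your local checkout of the r-source mirror; to keep this self-contained, it stages a stand-in file first, so the file contents below are made up, not R's actual source.

```shell
# Against a real checkout you'd run something like:
#   grep -rn --include='*.R' '3899' src/library/utils/
# Stand-in version so the sketch is runnable anywhere:
mkdir -p r-src-demo
printf 'roman <- function(x) ifelse(x < 1 | x > 3899, NA, x)\n' > r-src-demo/roman.R

# -r recurses, -n prints line numbers, --include limits to .R files.
grep -rn --include='*.R' '3899' r-src-demo/
```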
So he did have to do a little more intelligent filtering to figure out just where all of this is really taking place, and sure enough, he does eventually find it, within format calls and the like. But then he discovers there is a utility file with the name roman.R. Very straightforward. And there, sure enough, is the range comparison, checking for values greater than 3,899. There it is. He has found it. And now what's the next step? Right? Well, for R itself, we mentioned there is a mirror of the source code on GitHub, but that's not actually where the upstream code lives for development; that's a Subversion repository.
Shout out to all those who used Subversion in the past. It's been a while for me, but that is where, if you are wanting to contribute to the R project itself, you're gonna have to pull down that SVN repository to your local machine and then create a patch through SVN. There's an svn diff command there, I believe; I'm rusty with my Subversion these days. But when John reached out to the maintainers on the mailing list for R, they did recommend: hey, you know what, it looks like you're on to something; please submit a patch and file a Bugzilla report. Just like we talk about for contributing to open source in general: finding the best way to reach a project, making sure that issue is tracked, and then getting actionable feedback on it. That's the way to go, right? So John followed the protocols that have been established by the R project team to submit this report.
And now comes the part where, well, he submitted it, now you wait. Is it gonna get merged in? Sure enough, it does get merged in. This is exciting stuff here, right? John has literally patched the R language itself for this issue. Now, as you think about, well, will this really work, what's a great way to test if your patch is gonna work? Well, guess what? Here come containers again. John discovered what I've been using for years now, and what the R community has been using for years: being able to bootstrap particular R versions with Docker, and in particular the Rocker project, to check if this patch is indeed going to work on the upstream version of R that's coming from the bleeding edge of the Subversion repository.
He was able to pull that down into a container and then verify that his patch actually works. So what's next? Well, obviously, when the next point release of R comes out, this patch will be included in it. That'll probably be later this year. But this blog post illustrates such a unique story in terms of the nature of open source and the fact that one little learning exercise turned into patching the language itself. And John concludes the post with some really great advice if you find yourself in a similar situation in the future, whether it's in an R package or another language entirely.
First, don't always assume that the language itself is broken. Of course, you want to check that you haven't misspecified something. So read the documentation, run some additional tests; that's always helpful. And then, when you do think you've pinpointed something, guess what? Nature of open source: go into the source code itself. And, yes, we have learned that even with the base R source code, there are ways to search through it, both on the GitHub repo and through Linux utilities like grep and the like. So having a good knowledge of that is extremely helpful for some of these niche bugs like this. And then, don't wait to communicate.
Again, John reached out to the mailing list, put out what he was finding in his explorations, got a response from the maintainer, and was able to get direction on how to proceed next without, you know, going too far without that buy-in. And that can happen sometimes: some people submit patches without checking with a maintainer first, and then there might be a little disagreement, or maybe other work that wasn't merged in earlier. Always communicate early; nothing bad can happen, in my opinion, from communicating early on this. And then, yeah, if you find an issue like this, of course, if you have the capacity, it's excellent if you can ease the burden of the maintainers by drafting the patch yourself. Sometimes you might need a little help, and again, don't hesitate to ask: maybe a code review, maybe another test case that you'd like someone to assist with.
So I think this post is a terrific story of how to go about this process. And, yeah, don't be afraid of communicating with the R team on these issues, because, guess what? Like anything open source, it's not like they're gonna be able to catch everything themselves. And sure enough, this hard-coded limit sat in R for years and years for Roman numeral conversion without somebody really discovering it. So better late than never, right? But with open source, you can, you know, do your part as a user and as a contributor to get that fixed and benefit everyone else in the process. So, if nothing else, also check out John's blog post, because there is some gratuitous, very fun Simpsons imagery too that always warms my retro-viewing heart.
[00:14:47] Mike Thomas:
Yes. I really appreciated those last points that Jonathan made, those 4 different things that he recommends you may consider if you find yourself in the same situation. And it's a pretty empowering thing, right? Because we live in an open source world, you're able to contribute and submit a patch to the R language itself, which, you know, thousands, millions of people are going to use and be affected by. That's pretty incredible. And I think Jonathan's put together a pretty nice road map here to help you do that if you find yourself in a similar situation. I think you may need to turn back time to, you know, about 2005 to use SVN and a mailing list to do so, but we gotta use the tools that we have. And that's just teasing.
[00:15:42] Eric Nantz:
I would say sometimes it can be intimidating to figure out, okay, just how deep does this rabbit hole go? But sometimes, with a little perseverance, it does indeed pay off. This was a really, really interesting exercise. And you know what? I'm gonna bookmark that Exercism site. That is a really top notch way to hone your programming craft. So nice find there as well. I agree.
[00:16:20] Mike Thomas:
Eric, you know what else is interesting? The results of the 2023 data.table survey.
[00:16:26] Eric Nantz:
Oh, yes. And this is a good callback to just a few weeks ago, when we were mentioning how the data.table project was revamping some of its governance and making it easier and more transparent for ongoing road map ideas and how users can contribute. So, of course, what's the best way to hear how users are receiving your package and what they want, either suggestions for improvement or otherwise? And that is to release a survey earlier in the year. This blog post is coming from the data.table blog, and in particular the author, Alja Sluga. And he starts off with, first of all, thanking everybody that filled out this survey; they got almost 400 responses, which is really nice for a survey like this. And we'll walk through a couple of the key findings here and where they might relate to the data.table project in the future.
The post leads off with a little bit of demographic style information, showing that the majority of users that responded are quite experienced, having used R for 7 plus years and data.table, you know, quite a bit in that time frame as well. And many that responded to the survey are using it every day, so there might be a little bit of selection bias going on here. But, hey, it's always good to quantify that information. And then he gets into some of the tangible feedback itself. There were very specific questions, but a very obvious kind of trend came through in terms of what users appreciate the most about data.table, and it's something that actually brought me to some use of data.table in my early days of R programming.
That is performance. It is very memory efficient. If you've been down the road of having that massive CSV or other text file and having base R's read.csv crash your R session because of memory limits, well, data.table has always been very efficient in this space. And when people need speed, they turn to data.table more often than not. And then another positive feature, which ironically has another side to the coin depending on your perspective, is the syntax of data.table itself. I think, Mike, you and I agree that its syntax is very unique as compared to other frameworks in the R language.
But when you invest in that DSL, if you will, you can accomplish a lot in a pretty concise way. As for me, I'm just not a regular data.table user, so I do identify with some of the feedback that we're seeing in this post from those in the community having to look it up most of the time to figure out how to do certain operations. Again, there is some great documentation out there; it's just, for me, not muscle memory yet how to use the syntax. So, again, it's good to see tangible data showing these different trends across a different spectrum of user bases here.
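For anyone who hasn't seen that DSL, here's a tiny sketch of the DT[i, j, by] form, assuming data.table is installed; the data here is made up.

```r
library(data.table)  # assumes the package is installed

dt <- data.table(
  team   = c("a", "a", "b", "b"),
  points = c(10, 20, 30, 40)
)

# The DT[i, j, by] form: filter rows (i), compute (j), and group (by),
# all inside one pair of brackets.
dt[points > 10, .(total = sum(points)), by = team]
# team "a" totals 20, team "b" totals 70
```

Once that pattern clicks, a filter-summarize-group pipeline that might take several verbs elsewhere is one bracketed expression.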
And, overall, it looks like people are pretty satisfied with the package itself. Again, not everything is perfect: performance comes out as one of the most favorable areas, but then you see some, you know, not so great issues as well. In terms of desired functionality, there were some feature requests out there. And, Mike, why don't you take us through some of what the users are kinda hoping for in the future in data.table?
[00:20:03] Mike Thomas:
Yeah, absolutely. You know, I think one of the most insightful charts for me in this blog post is this importance versus satisfaction plot, which is really interesting. And just to highlight and sort of summarize the feedback from the community: the feature of data.table that was rated with the highest importance and had the highest satisfaction was performance. And then, you know, lower on the importance side but high in satisfaction was the minimal dependencies, which is absolutely a strength of data.table.
And then higher on the importance but lower in satisfaction, so I think these are the things that respondents are hoping data.table may improve, would be, you know, the docs and the legibility of the syntax itself. So in terms of the desired functionality that they're talking about, one would be support for out of memory processing. I think this is something that, you know, has come to light especially with, I believe, the arrow package. Does that do out of memory processing? I believe so. Okay, so that sort of allows you to operate on the file on disk without bringing it all into memory first. Folks are also looking for richer import and export functionality, with Parquet being the most commonly mentioned item, followed by good old xlsx format.
We can't escape the spreadsheets, can we? Oh my goodness. And then the last piece of desired functionality they have listed here is integration with the pipe operator, which also lined up with some of the questions around how much folks are using the pipe, and I imagine they're talking mostly about the native pipe here. The majority of respondents answered that the pipe is very useful to them and that they would find some sort of convenience function for using data.table with the pipe to be very, very helpful.
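As a rough sketch of what piping into data.table can already look like today (assuming data.table is installed, and R 4.3 or later for the underscore placeholder in extraction calls):

```r
library(data.table)

dt <- as.data.table(mtcars)

# data.table's own chaining has always worked without any pipe:
dt[mpg > 20][, .(avg_hp = mean(hp)), by = cyl]

# With R >= 4.3, the native pipe's `_` placeholder can head an
# extraction call, which makes the [i, j, by] syntax pipeable:
dt |>
  _[mpg > 20] |>
  _[, .(avg_hp = mean(hp)), by = cyl]
```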
And then there is this notion of, I don't know how these things get these names, but this is the name that's been around forever, an alias for the walrus operator, which is just a colon followed by an equals sign. And I guess that sort of lines up with data.table's mascot. Right? It's a walrus? That's right. Yeah. Love the synergy there. Yes. So I think folks were looking for maybe a more plain-English alias for that operator, with some of the options being either set, let, or set j.
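For readers unfamiliar with the operator being discussed, here is a small hedged sketch (assuming data.table is installed; the let() alias appears in recent data.table releases, around 1.15.0, so check your version):

```r
library(data.table)

dt <- data.table(x = 1:3)

# The classic "walrus" operator adds or updates a column by reference:
dt[, y := x * 2]

# Recent versions also accept let() as a plainer-English alias for :=
dt[, let(z = x + y)]

dt
```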
And set seemed to be the most popular response for a function name that would provide an alias for that walrus operator. And then sort of the final chart, as this blog post starts to wrap up, is on the topic of actually contributing to data.table, gauging folks' interest in contributing to the project or their contributions in the past. Not surprisingly, spreading the word about data.table and just reporting issues were the top two responses in terms of what folks would be interested in and maybe have the capacity to do, followed by actually contributing to the code base itself. So some users, I guess, in conclusion, are a little worried that the package may be abandoned or stagnating.
One thing that I would want to say, that I've seen on social media before, is that this gets treated like the next iteration of the language wars. It's now dplyr versus data.table, and you have to be in one camp or the other, and if you're in one camp, you have to dislike the people in the other and vice versa. I think that's absolutely ridiculous, and I hope that doesn't really exist. I would say that, like anything else, it's amazing to have options; use the tool that fits your use case and your comfort level the best. data.table is fantastic if you want to use it. If you want to use dplyr and Arrow or, you know, DuckDB, you can use that too. So I think as long as the community continues to rally around the package, initiatives like this one, to get feedback and to understand how it can be improved, will go a long way towards the longevity of data.table as well.
And I know that they have done a lot of work on this package around documentation and community just in the last maybe 6 to 12 months. So I'm excited to see these results, and I think the community is strong.
[00:24:58] Eric Nantz:
Yeah. And lots of positive momentum, like you said, this year with some of the steps they're taking. And not that it was lacking in any way, but as open source projects evolve, you do often have newer contributors or newer users come on board and look at what the available options are for, say, data processing and data manipulation. It's always great to have choice in this space. I know sometimes in my industry there are some people who get a little confused about having so many choices in certain domains. But you know what? For your specific project, if data.table fits your needs, use it. Boy, I remember many days of importing some huge textual biomarker data files, and data.table was as fast as could be in that space. We have lots of great code bases that leverage that package heavily, so I'm always of the mindset: if it ain't broke, don't fix it. And with respect to data.table maintainership, yes, it is alive and well. They are really spreading the message out through various channels, and this survey should serve as reassurance to everybody that they really have the users in mind, both those that have been using data.table for years upon years and those that are coming new to the project, because both are equally important to the lifespan of this space.
And, certainly, I'm really appreciative of the transparency, and I see nothing but great things happening for the project going forward. The fact that they're sharing this more actively is, I think, a huge step towards bringing this, not that it wasn't a first-class citizen before, but really putting it into the mindshare of most of the R community. I think the data.table project itself is doing great things to make that happen.
[00:26:47] Mike Thomas:
No. I I agree as well. Lots of positive momentum, lots to look forward to, and in no way is this project doomed.
[00:27:05] Eric Nantz:
Well, luckily, Mike, we're not doomed in terms of the rest of this episode, because we do have some fun things to talk about here, especially on the visualization side. But, of course, you listening, maybe you're wondering why the heck we're talking about doom and gloom here. Well, we're not referencing that kind of doom. We're referencing something that was a part of my retro gaming heart many, many years ago in my college days: getting together with some friends and playing the heck out of the Doom game by id Software, which was often a trendsetter for all these first-person perspective games. Now, just what does this have to do with R itself? Well, our last highlight has done this very interesting geometric-type exercise on just how Doom maps could be created in the context of R itself, in the aspect of 3D-style visualizations.
Now Mike and I had to do a bit of detective work on this, but we're pretty certain that this blog post has been authored by Ivan Krylov, though we admit we could not find any trace of that on the blog post itself. We did some sleuthing on their GitHub repo, so, hopefully, we're correct. One way or another, we're gonna go with that for now unless we hear otherwise. But Ivan leads off this post talking about when you would want to visualize a function surface in a 3D-type landscape. So you may be thinking, if you have experience in this space, of something like a contour map where you see the elevation in a map setting. In fact, it reminded me a lot of packages that have been developed, such as rayshader and rayrender and the like, which have been doing a lot of those 3D visualizations in R itself.
And guess what? Base R itself comes with this built in, especially if you're using extension packages like lattice; there is a way to do contour plots in that. The rgl package in the R community helps you do 3D plots in R. But, you know, he thought, we could just do that, but let's make this fun. Let's make a Doom map out of it. Now, I've only seen the end product of a Doom map, but just what does that really entail? Well, Mike, we're going to geometry school for a little bit on this one, so buckle up here. Apparently, in the first and second iterations of Doom, there was no concept of a floor that could go up a hill or down a hill.
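The built-in plotting Eric mentions can be seen with base R's bundled volcano height matrix, no extra packages required:

```r
# volcano ships with base R: an 87 x 61 matrix of terrain heights
z <- volcano

# 2D contour lines, like a topographic map
contour(z, main = "Contour view")

# A simple 3D perspective surface, also built in
persp(z, theta = 30, phi = 25, expand = 0.5,
      col = "lightgreen", main = "Surface view")
```

(lattice::wireframe() and rgl::surface3d() offer fancier takes on the same idea.)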
So, apparently, you would have, like, the sky for height, but then you'd have your tiles at a certain level, maybe down at another level in a stepwise fashion. And, of course, R itself, in terms of how you would visualize this, is not gonna come with everything out of the box. So there are some open source utilities, called ZDoom and Zandronum, which are apparently gonna help with the overall visualization of this before we feed into it from R itself. But here comes the geometry school at play: a Doom map is gonna have a series of points or vertices, lines, sides, and sectors.
And, yes, there are obviously point coordinates for the vertices, an x and a y, and you've got lines connecting them. Then you've got the sides that are visible to the player when they look left or right. And then, also, there will be textures, but that's not really the point of this post. And then there's how the actual height information is represented, and those are called sectors. So Ivan's original idea was to start with the contourLines() function and then try to makeshift some artificial slope to get to the heights of this. But, apparently, it didn't quite cut it, where the editor was trying to fix some things that were missed in the translation.
So he kind of had to go back to the drawing board and go with something more universal with respect to Doom maps, and that is literally called the Universal Doom Map Format, also known as TEXTMAP, which can store the additional information of the heights of these points and not just the x and y coordinates on, kind of, the lower plane, if you will. And then it gets to be really math-heavy, or geometry-heavy, because, apparently, you need to be able to split these height maps into triangular shapes.
And the blog post has a great illustration of splitting a rectangle into two triangles of equal area, with the vertices interpolated along the way, and then some clever use of base R's array() function, capturing data frames of these x, y, and now z coordinates that capture the height of the contours of these planes. And then a lot more manipulation to figure out how to connect all this together. Lots of custom data frames being created here, lots of temp files being created for that mapping utility. And then, once he's able to feed these variables into the mapping software, yes, at the end you have yourself a literal Doom screenshot after he fed it into that open source utility I mentioned earlier.
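The rectangle-to-triangles step can be illustrated with a tiny hedged sketch; this is illustrative base R, not the post's actual code, and triangulate_cell is a hypothetical helper name:

```r
# Split one rectangular grid cell of a height matrix into two
# triangles, so an engine with flat sectors can approximate a slope.
triangulate_cell <- function(z, i, j) {
  # heights at the four corners of cell (i, j)
  z00 <- z[i,     j]; z10 <- z[i + 1, j]
  z01 <- z[i, j + 1]; z11 <- z[i + 1, j + 1]
  list(
    # lower-left triangle: (i,j), (i+1,j), (i,j+1)
    tri1 = data.frame(x = c(i, i + 1, i), y = c(j, j, j + 1),
                      z = c(z00, z10, z01)),
    # upper-right triangle: (i+1,j+1), (i,j+1), (i+1,j)
    tri2 = data.frame(x = c(i + 1, i, i + 1), y = c(j + 1, j + 1, j),
                      z = c(z11, z01, z10))
  )
}

tris <- triangulate_cell(volcano, 1, 1)
```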
I think there was some more manual processing with another utility called SLADE. And sure enough, there is a reproducible R script. If you have that same map emulation software, you do get a shot of the player looking at a contoured hill that looks like it's from the game itself; I wouldn't be able to tell the difference. Like some overworld-type area. So I admit I have never thought to try anything like this, but guess what? If you wanna try this out, with the right software installed on your system, the R script is downloadable. You can check it out yourself and give it a shot. And maybe it's a great way to boost your geometry and mapping skill set while having some fun along the way. So, hopefully, Ivan, we're getting your name right here, but thanks for opening our eyes to a use of R that I never thought I'd see happen in my lifetime. But guess what? There's nothing that R can't do. Right, Mike? Absolutely. And it's incredible how much of
[00:33:50] Mike Thomas:
what's generated here is from base R's plotting functions as well, and just, you know, sort of vectors and things like that. If you download this R script that's linked at the end of the blog post, it's fairly concise for what's necessary, I think. He has these 3 different functions, triangulate, as text map, and then the final one, image to doom, which spits out a file that I believe you can pass to that software, SLADE, that'll help generate this exact image we're seeing on screen. Fairly concise. It's really cool; I'm really enjoying reading the code here.
I learn something new every day. Today, I learned that there's a function in R called is.unsorted() to test whether the vector you pass to it is sorted in ascending order or not. I'm not sure if I have any use cases for it yet, but I'm certain that sometime in the future I will. The code comments are incredible. He has a beautiful diagram in the code comments here plotting this coordinate map, just literally using comments and characters on your keyboard. That's absolutely fantastic, and it lines up with the diagram in the blog post under the triangular sectors section.
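A quick look at the base R function Mike called out:

```r
# is.unsorted() reports whether a vector is NOT in non-decreasing order,
# making it a cheap "do I actually need to sort?" guard.
is.unsorted(c(1, 2, 2, 5))   # FALSE: already sorted (ties allowed)
is.unsorted(c(3, 1, 2))      # TRUE: out of order

x <- c(5, 1, 4)
if (is.unsorted(x)) x <- sort(x)
x                            # 1 4 5
```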
So, you know, a really interesting use case. I'd be interested to see how you could take this to the next level with rayshader and maybe make Doom look like it's in a 2023 sort of graphical state. You know, Eric, to be honest, I don't wanna date you, but I'm not familiar with Doom. Halo was probably my first foray into first-person shooters, if you will, on the old Xbox. And even all the way back then, I think the graphics were a little bit of a step up from what we had in Doom. But I'm sure I would have enjoyed Doom if I had been there.
[00:35:53] Eric Nantz:
Yeah, I dare say you would have. And if that wasn't dating me too much, I better not mention Wolfenstein, because that even predated Doom as id's first entry into the FPS space that kind of changed the world. But, yeah, talk about going a step back in retro graphics; that one's a bit hard on the eyes. But we've actually seen very interesting use cases of games like this where maybe it's not so much the actual end product, but that they are extendable via mods and things like that. And that's where having the code helps; I believe the Doom code is open source now, so you could literally browse it yourself, and, hence, you see the modding community go to town on things like this. But, yeah, I definitely got the same vibes as you did, Mike, about how you could combine this with some of the awesome work of rayshader and the like to really beef up a fun demonstration that's built entirely with R itself.
But, yeah, I did take a look at the script, which, like you said, is available for download. Very well commented and easily reproducible with the right software. So, again, if you thought R wasn't able to do certain things in terms of visualization combined with retro gaming, well, this post has definitely settled that for you. Yes, I'll have to check out Ivan's previous posts, because he's definitely got a great selection of additional topics with respect to, you know, integrations with C and, looks like, other posts on contributing to R itself. Lots of great nuggets here, and I'll definitely keep this bookmarked.
[00:37:33] Mike Thomas:
You know what I always say? R is the 2nd best language for doing just about anything. Couldn't have said it better myself.
[00:37:40] Eric Nantz:
And maybe the 2nd best resource for everything in R might be R Weekly itself, because we have a mix of everything as well, from the highlights we talked about today to other interesting use cases via blog posts, tutorials, and new and updated packages and the like. So we'll take a couple minutes to talk about our additional finds here. For me, this isn't so much R-specific, but we alluded to it earlier, Mike: it is conference season, and it's starting to get underway with various conferences out there. Maybe you are like me, especially in my earlier days when I would go to these meetups for the first time; I'm a bit of a shy dude, I must say. So, you know, what's the best way to feel comfortable and find ways of connecting with others? Well, my additional find here is from the Jumping Rivers blog, authored by Rhian Davies and Keith Newman, called An Introvert's Guide to Networking at a Conference.
This is a very nice way to ease that little fear or apprehension you might have at the beginning of these events: how you might navigate certain situations, how to keep in contact with people you do end up networking with, some ideas for icebreakers, and not feeling too much pressure if you're being sent on behalf of, say, the organization you're a part of, but really trying to soak in that experience in an optimal way. So, yeah, I definitely resonate with a lot of these points. And I'll also mention a heads-up that we often hear at the various posit::conf events: the idea, when you're in a group setting, of the Pac-Man rule, leaving an open slot so that people can join your group and join in on the discussion.
Things like this, with practice, really do add up and help make you feel a lot more comfortable. So a really great post from the Jumping Rivers blog, and, yeah, posit::conf will hopefully be my next in-person event, and I'll be taking this to heart like always.
[00:39:44] Mike Thomas:
I like that one a lot. Another one that I found was from El Saman on the key advantages of using the keyring package. The keyring package allows you to essentially store secrets that are retrievable, I think most commonly through environment variables. And one of the differences between using keyring and maybe using a .Renviron file that would be project-specific is that, with keyring, you can store that particular secret once and for all per computer, which is nice. You don't necessarily have to do that on a project-to-project basis.
You also do not have to worry about somebody accidentally forgetting to .gitignore that .Renviron file and it making its way up to GitHub or GitLab or whatever hosting service you use for your git repository. So that's a nice feature as well that you may be interested in leveraging, as opposed to doing it the old hard-coded way with Sys.getenv() and setting environment variables that way. So it might be interesting for folks who are looking to brush up on their best practices around security, environment variables, passwords, secrets, and all that stuff.
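A minimal sketch of the workflow being described (assumes the keyring package is installed and an OS credential store is available; the service and username values are hypothetical):

```r
library(keyring)

# Store a secret once per machine; key_set() prompts for the value
# interactively so it never appears in your code or history.
key_set(service = "my-database", username = "analyst")

# Any project on the same machine can retrieve it later:
pwd <- key_get(service = "my-database", username = "analyst")

# Contrast with the .Renviron route, where the secret sits in a
# plain-text file you must remember to .gitignore:
# pwd <- Sys.getenv("MY_DATABASE_PWD")
```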
[00:41:04] Eric Nantz:
Yeah. This is terrific when you're using R in, like, a traditional client kind of setting, where you may have a team using the RStudio IDE or whatnot on a local machine. The keyring package is gonna be instrumental in helping, like you said, keep some of those credentials secure, not nag you for them all the time, and minimize the potential for leakage. Unfortunately, I don't think this would be compatible with, say, a Shiny app that's deployed on a server somewhere, but I will have to look into this a bit more, because I know the keyring itself is used every single day. When I log on to my Linux system here at home, I often get prompted once for my administrative password to do a certain task, but that's being stored in the keyring credential store and not anywhere else. So there are lots of ways I'm sure this could be used that I'm probably not even aware of. Great find as always.
[00:41:58] Mike Thomas:
And then we have one more that I think we would be remiss not to mention, at least to give a quick shout-out to Bruno Rodrigues. We are at part 10 of Reproducible Data Science with Nix, and the discussion here is on contributing to Nix packages. So if you have been following along with Bruno's saga and crusade on getting folks to check out Nix for doing reproducible data science, having that fully reproducible environment that you can come back to years from now and run your code and it'll still output the same thing, check out part 10. It's the latest in the series, and it will not disappoint.
[00:42:38] Eric Nantz:
Yeah, and it dovetails so nicely with what we talked about earlier, the idea of patching such an influential project as the R language itself. But guess what? With Nix, the momentum keeps coming. I was even doing a little poking around, unrelated to Nix itself, while continuing my efforts on this Shiny application as a WebAssembly bundle for my R Consortium work. I was poking around the webR repo that George Stagg has been working on, or I should say Shinylive, Shinylive for R, and I saw a commit saying they are making things compatible with Nix packaging. So the plot thickens. It seems like more traction is happening with respect to the big players in the R community and Nix. So, yep, Bruno, I'm not sure if your series is ever gonna end, but I'll be bookmarking it one way or another.
[00:43:34] Mike Thomas:
I hope it doesn't. That's awesome.
[00:43:36] Eric Nantz:
Yeah. There's much more than just that in this week's issue. Again, it was tremendous fun curating this for all of you. And thanks to Jonathan Carroll again for his awesome utility, we call it the Curinator, which helps drive some of these feeds for us in a more systematic way with GitHub Actions. So thanks, Jon, for making that for our curator team here. But, of course, R Weekly does not live without all of you in the community and your contributions. We invite you: if you see a great blog post, a great new package, or a great new tutorial, and you want the R Weekly audience to see it, well, we're a pull request away, talking about contributing, right? You won't have to dive into any internals of R itself to do this. You just have to go to rweekly.org.
There's a handy link to the draft right at the upper right corner, and you can just submit a pull request with your markdown link all formatted for you and ready to go. That's a great way to contribute to the project. And, as always, we are looking for curators as well. If you wanna sign up for that, or get to know the process around it, we also have links at the top of each issue for how you can get involved with R Weekly. And then, also, we love hearing from you in the community. We've got the handy contact page in the episode show notes, and with a modern podcast app like Podverse or Fountain, you can send us a little boost directly in your app to have a little fun along the way with all of you. And then, also, we are sporadically on the social medias.
I'm more often on Mastodon these days; you can find me at @[email protected]. I'm sporadically on the X thing as @theRcast, and on LinkedIn from time to time, cross-posting the episodes and chiming in with some fun R projects.
[00:45:30] Mike Thomas:
Mike, where can the listeners get a hold of you? Sure. You can find me on LinkedIn if you search Ketchbrook Analytics, k-e-t-c-h-b-r-o-o-k, to find out what I'm up to, or occasionally on Mastodon as well at @[email protected]. And, I guess, a little episode cleanup around 2 things that I mentioned. I think I did shout out at some point in the podcast, "don't write unit tests." That was, of course, satire and a joke; please do write unit tests. And secondly, I think I may have said that R is the 2nd best language for doing just about anything. Obviously, it's the first best language for doing just about anything. So a little cleanup there.
[00:46:11] Eric Nantz:
I think it's implied, but, you know, it never hurts, right? And, yeah, I expect transparency on this show, so we fully appreciate that, Mike, as always. And, yeah, I'm about to probably go through some more yak shaving, if you will, on an internal project. But just as I think I'm at the finish line, I'm probably gonna find something else to occupy my time with. But, yep, thank you, as always, to all of you around the world for listening, and we will be back with another edition of R Weekly Highlights next week.