The summer schedule has been crazy, but we finally have a new episode of R Weekly Highlights! In this episode: How the new shiny2docker package eases your entry to the world of containers, the power of WebAssembly in full ggplot2 glory, and how the latest solution for speeding up R code draws upon a classic computing language you may not expect.
Episode Links
- This week's curator: Eric Nantz: @[email protected] (Mastodon) & @rpodcast.bsky.social (BlueSky) & @theRcast (X/Twitter)
- Containerizing Shiny Apps with {shiny2docker}: A Step-by-Step Guide
- ggplot2 layer explorer
- {quickr} 0.1.0: Compiler for R
- Entire issue available at rweekly.org/2025-W24
- {attachment} - Tools to deal with dependencies in scripts, Rmd, and packages https://thinkr-open.github.io/attachment/
- The Rocker Project - Docker Containers for the R Environment https://rocker-project.org/
- r2u - CRAN as Ubuntu binaries https://eddelbuettel.github.io/r2u/
- ShinyProxy https://shinyproxy.io/
- GitHub repository for ggplot2 Explorer https://github.com/yjunechoe/ggplot2-layer-explorer
- Use the contact page at https://serve.podhome.fm/custompage/r-weekly-highlights/contact to send us your feedback
- R-Weekly Highlights on Podcastindex.org - You can send a boost into the show directly in the Podcast Index. First, top up with Alby, and then head over to the R-Weekly Highlights podcast entry on the index.
- A new way to think about value: https://value4value.info
- Get in touch with us on social media
- Eric Nantz: @[email protected] (Mastodon), @rpodcast.bsky.social (BlueSky) and @theRcast (X/Twitter)
- Mike Thomas: @[email protected] (Mastodon), @mike-thomas.bsky.social (BlueSky), and @mike_ketchbrook (X/Twitter)
- WillRocky - Return All Robots! - WillRock - https://ocremix.org/remix/OCR02280
- The Unnamed Frontier - Metroid II: Return of Samus - Pyro Paper Planes, Viking Guitar - https://ocremix.org/remix/OCR02892
[00:00:03]
Eric Nantz:
Hello, friends. Yeah, it's been a minute, but we are finally back with another episode. In this case, episode 206 of the R Weekly Highlights podcast. We have been off for a bit, but this is the show where we talk about the excellent highlights that have been shared, as well as some additional things as time permits, in this week's R Weekly issue. My name is Eric Nantz, and hopefully you didn't forget my voice. I feel like it's been a bit since we last were able to speak to you all in your favorite podcast players. I've definitely had a whirlwind of a few weeks, and I'll just say, a note to future self: agreeing to do three conferences in a matter of two weeks may not be the best for your stability. But I got through it, and the last one was especially fun. Nonetheless, I am here.
But I am not alone. Thankfully, who's able to carve out time today after a bit of chaos is my awesome cohost, Mike Thomas. Mike, how are you doing today? Doing pretty well, Eric. Apologies to the listeners. We'll probably sound a little bit different today. I'm
[00:01:04] Mike Thomas:
using, unfortunately, my earbud headphones to record instead of my typical microphone, which has officially bit the radish. And I think this is one of the byproducts of having not recorded for a couple weeks as I decided to test out my microphone, about five minutes before we jumped on and realized that it is officially out of commission. So I will be hitting up Amazon very shortly
[00:01:26] Eric Nantz:
and should sound better next week. No, we know how it goes. Just like with the infrastructure we've been spinning up, sometimes when it fails, it fails at the most inopportune times, and we often put those things off. Well, good for you, it's an Amazon click away. Other times there's a lot of blood, sweat, and tears, so to speak, in getting everything set back up. I've been down that road too, especially in my Docker adventures literally right before I record here, but I think I got it solved. Anybody that runs Docker on Windows, and we're gonna talk about this a little bit, you definitely have my sympathies. Man, WSL is a thing to deal with. Anyway, I digress. I tend to do that from time to time. So let's look at the notes here. It has been a minute, Mike. We always do our prep show notes together.
Let's look at the curator here. Well, of course, it would be me. Whenever I have so many things going on, usually that's the time when I have to curate an issue. But thank goodness, I was able to get one out there, and we got a lot of great selections to talk about. So this was a pretty easy one for me to get out the door, but as always, I could never do it alone. I had tremendous help from our fellow R Weekly team members and contributors like all of you. In this case, I think it was a whopping five or six pull requests that had some great resources that have been shared in this week's issue. So without further ado, let's get right into it. I already teased it out, right?
There is a novel technology that has, you know, hit the software development landscape for the last almost fifteen years, it seems like, and that is the world of container technology: the way that you can encapsulate not just a quote-unquote application or a library, but its own system dependencies, in what you might say is a cross between a full-blown virtual machine on your system and a more fit-for-purpose architecture. In this space, the Docker runtime for containers has been what usually gets the most attention. It's had the most mindshare, you might say, but it's not the only game in town, of course. There are some other ones like Podman, which has a lot of compatibility with Docker.
And if you wanna get really low level on Linux, you can use what's called LXC for those kinds of containers. As an R user, you may be wondering, what are some of the best use cases for this? And I still stand by what I'm about to say: in the context of Shiny development, containers are a massive help. It opens up a lot of doors, a lot of possibilities for "I can scale this," to use those cliches, and also, more transparently, for where you host it. There are a lot of options available to you. You may be new to Docker and wondering, oh gosh, do I have to be a Linux expert to get started with this as an R user?
Not necessarily. You can grow into it, because our highlight is talking about a fantastic package to kinda ease your way into the world of Docker, especially in the context of Shiny development. And it comes to us as the package called shiny2docker, with the number two in the middle. And we have a great blog post from the team at ThinkR, in particular Vincent Guyader, who I believe is a coauthor of this package, to introduce shiny2docker and give us the use case of just why it's important and the mechanics of how it works. The team at ThinkR has been using Docker for quite a bit. I've had multiple conversations with the esteemed Colin Fay, also a fellow curator on the R Weekly team, on his container adventures, and, of course, WebAssembly adventures on top of that.
But Docker has been a key focus in one of Mike and my favorite packages, golem, to help with those deployment concerns. And I'm gonna say it again because I can't resist: once you go golem, you never go back. At least in my humble opinion. I'm getting a virtual applause from Mike here, so I'm not too off base. So they've always had hooks to use Docker, to help generate a Dockerfile, for probably the past year or so. But now they've encapsulated that functionality into a standalone package, hence shiny2docker. So what does it get you on the tin, so to speak?
You get what's called the Dockerfile, which in essence is the set of instructions that you give the Docker runtime to build your, what you might call, image or application-based image. shiny2docker is gonna help automate that process and create it for you. It's also gonna interact nicely with renv right out of the box. If you don't have one already, it's gonna generate an renv lock file, detect what versions of packages you are using in that application library, and bootstrap that for you to ensure reproducibility of the package versions.
I do have additional thoughts on that towards the end; I'll save them for then. But what's also nice is the role shiny2docker plays in CI/CD, especially with GitHub Actions, GitLab runners, and whatnot. It's going to help you create either a GitLab CI configuration or a GitHub Actions workflow YAML file, which is wonderful to not only build a container on that infrastructure, but also push it to a registry for these container images, the most prevalent one being Docker Hub. But you could choose additional registries as well.
And that is a huge deal, because then, with that image in place, you can choose where you deploy it. So how do you get started? Well, of course, you're gonna install the package with the typical install.packages, because it is on CRAN as I record. So you can get it straight from CRAN or R-universe or other places like that. And then the rest of the blog post gives us kind of a case study for preparing a relatively simple Shiny application, where it's in a simple app.R, very much the hello-world-type Old Faithful histogram that we all know very well.
And then once you have that encapsulated, it's time to generate the Dockerfile. It's simply a call to shiny2docker. Give it the path to where your app.R and the rest of your Shiny files are, and then it kinda takes care of the rest in a very user-friendly fashion. It's gonna give you the Dockerfile ready to go, plus a special text file called .dockerignore. If you're familiar with version control with Git, you can have a .gitignore to tell Git, don't touch that file. Environment variable files, say hello. Don't ask me how I know. But you can have the same thing with .dockerignore. There may be some things in your application that were more for development. You don't want those in your Docker image, because you do wanna keep the image at a reasonable size; otherwise that's gonna, you know, prolong your download times when you actually pull this to run it.
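As a rough sketch of that generation step (assuming, per the post, that the main function is `shiny2docker()` and that a `path` argument points at the app directory):

```r
# Hypothetical sketch based on the workflow described in the post.
# install.packages("shiny2docker")  # the package is on CRAN
library(shiny2docker)

# Point it at the folder containing app.R; it writes a Dockerfile,
# a .dockerignore, and (if one is missing) an renv lock file alongside the app.
shiny2docker(path = ".")
```

From there, the generated Dockerfile is plain text you can inspect and edit before ever touching the Docker CLI.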
And then throughout the steps, it's got a lot of nice kinda error checking. Like I said, it's bootstrapping renv out of the box for you to help with getting that lock file, and then it's gonna put, in that Dockerfile, the command to actually run the application using the Shiny runApp function, which we typically place, like I said, at the end of these Dockerfiles, these instructions. And then once you have that in place, you can step back to Docker itself and run it, assuming you have Docker on your system, which again is a prerequisite. I do wanna say, in my humble opinion, working with Docker is much easier on Linux than on some of the other operating systems, whether Mac or Windows. But I feel especial sympathy for the Windows users, because you do need the Windows Subsystem for Linux, version two in particular, to run Docker on your system.
Trust me when I say that can be finicky depending on your corporate environment. So best of luck with that.
[00:09:49] Mike Thomas:
Mike might have something to say about that. I'd say Casper can help. We've done that a lot. But, yes, you're exactly right. Once you get WSL two installed, I would just highly recommend installing Docker Desktop to automatically install the Docker engine and give you a nice UI into all of your running containers and images as well.
[00:10:08] Eric Nantz:
Right on. That's exactly what we are recommending. Some of you may be aware, as a side tangent, that I am part of the Submissions Working Group under the R Consortium. Literally, as I record, tomorrow I am set to transfer the next version of that pilot with the Docker version of a Shiny app. And, yes, you better believe in my instructions for the reviewers, we have: install WSL, and then install Docker Desktop. We need the easiest way for them to get containers running. So, yes, that's a very on-point recommendation. So once you have that in place, you're gonna build the image with docker build. You can give it what's called a tag, like a little label for it, and it's gonna inspect that Dockerfile and then let the magic run, so to speak, building it layer by layer. And then once you have that image in place, you can use it at any point on your system with the docker run command.
Give it a port that you want your Shiny app to be exposed on, give it a name if you wish, then the name of that image afterwards, and then you're gonna get a process in the terminal saying, okay, your app is running on localhost on, you know, port whatever. Then you can browse to that port. Typically, you can map it to port 80 if you wanna make it super simple, but you can choose any port, really, at that point. And then you've got yourself a Shiny app. It should work just as if you were running it in your R environment, like in RStudio or Positron or whatnot. That's the whole idea. It's just another way to execute your application, but your application source code is the same.
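The build-and-run commands described here look roughly like the following. The image name is a placeholder, and the port is an assumption: Shiny conventionally listens on 3838, but match whatever port your generated Dockerfile exposes.

```shell
# Build an image from the generated Dockerfile, tagging it with a label
docker build -t my-shiny-app .

# Run it, mapping host port 3838 to the container's port 3838;
# --rm cleans up the container when it stops
docker run --rm -p 3838:3838 my-shiny-app

# Then browse to http://localhost:3838 to see the app
```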
So the post concludes with some nice kinda practical tips to give you the smoothest process to get yourself on this container journey. One, yeah, as you're iterating, I'm sure you're gonna have, like, cruft in your app directory as you're trying stuff. Once you're done, like you've got a stable thing, try to keep that relatively clean. And when necessary, add directory names or file names to the .dockerignore if they're really not meant to be used by the container. They do recommend, and I wholeheartedly agree in this context, that using renv from the start, even outside of this Dockerfile or shiny2docker pipeline, can be helpful. Although, I just gave a talk at R/Medicine about how I think Nix honestly can be an even more attractive alternative to this. Mike someday will be converted to that. We're not there yet. But it's proven for me in multiple projects now.
Nix can also fit nicely with Docker, where you can have Nix handle your development libraries for the app, and even the system libraries for that matter. And in Docker, instead of using R and renv to bootstrap that, you can just bootstrap from that same Nix recipe, if you will, and you'll get the same versions of packages. Myself and Bruno Rodrigues did our talks on this at R/Medicine; hopefully the recordings will be out soon. We think with Shiny, there's a pathway there. So, I digress. Other great things to think about: definitely test things locally before you, you know, rely on CI/CD to help with a lot of this. Trust me when I say weird things can happen in your CI/CD scripts, whether it's the YAML instructions that you give it, or, like me, a glutton for punishment, you put in some bash shell scripts to automate creation of files. You may just mistype one path and it all goes haywire.
Try to test that locally. You'll thank yourself later. Also, you wanna make sure that you're thinking with security in mind. You don't wanna put a lot of sensitive credentials in these images, just in case. Hopefully there are ways around that via environment variables in the platform you choose to deploy this to. Many of them support that, where you don't have to put it in the code, so to speak. And, yes, this is your gateway to Docker, but it does help to have a little bit of background into what's happening in, like, that Dockerfile and set of images.
How do you get system dependencies inside? Where does R fit in the picture? It does take a bit of getting used to. I won't pretend I have everything figured out. But know the way the layer system works: you wanna put the things that are most frequent to change towards the end, not towards the beginning, because once you change something, everything below that step has to be rebuilt anyway. So if you think about it, if your app is really iterating, you wanna put references to the app code as far down that Dockerfile as possible, which, again, shiny2docker is gonna help you with, and keep the package installation stuff towards the top. That way, that part stays pretty stable while you just iterate on the code. It'll take much less time to build these Docker images than if you had the app stuff at the beginning and then you're, like, doing all the package installation after it.
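The layer-ordering advice above can be sketched in a hypothetical Dockerfile. The base image tag, file paths, and port here are illustrative assumptions, not necessarily what shiny2docker generates:

```dockerfile
# Stable steps first: these layers are cached and rarely rebuilt
FROM rocker/shiny:4.4.1
COPY renv.lock renv.lock
RUN R -e "install.packages('renv'); renv::restore()"

# Frequently changing app code last: editing app.R only rebuilds from here down
COPY app.R /srv/shiny-server/app.R
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp('/srv/shiny-server', host = '0.0.0.0', port = 3838)"]
```

With this ordering, a code-only change reuses the cached package-installation layer instead of re-running it.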
Yes, that can be like watching paint dry sometimes, so you just gotta be prepared for that. In the end, a very nice, you know, gateway as a Shiny developer to get into containers. In my humble opinion, I think this can also work nicely with Nix. Maybe I'll talk to the authors of this in the future about having an extension of sorts. With that, with {rix}, who knows? The ideas are out there. But, Mike, you and I are big fans of containers. What did you think about Vincent's post here? Yeah. We try to containerize
[00:15:57] Mike Thomas:
everything, especially Shiny apps. I really like this post and the introduction to the shiny2docker package, which seems to heavily utilize the dockerfiler package under the hood, which I have been previously familiar with because that's a dependency of golem. golem has some functions, I believe, in the dev folder, the 03_deploy script, for those golem developers out there, that allow you to create a Dockerfile for, you know, some of the different sort of end places where you might be landing your Shiny apps. And it's a one-liner that does a lot of this type of thing behind the scenes, and it looks like shiny2docker kind of extends this a little bit. There are some packages I think really help in this process.
One of my favorites is the attachment package, and I know shiny2docker leverages that. There's this function from attachment called create_renv_for_prod that shiny2docker uses under the hood, which can actually, I believe, sort of ensure that all necessary R packages for your app are accounted for in the Docker image via your renv.lock file. And one thing that I will mention is getting, you know, renv and Docker and everything to play nicely can be a little tricky, especially when it comes to sort of updating your app and updating packages in your app. But once you get the hang of it, and I think thanks to some of these other helper functions, it can be a huge lifesaver, especially around dependency management and making sure that what works on your machine works in production.
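A minimal sketch of that dependency step, assuming the function signature from the attachment documentation (the `path` and `output` arguments here are assumptions about the defaults):

```r
library(attachment)

# Scan the project for the R packages the app actually uses and write a
# production-oriented renv lock file that the Dockerfile can later restore.
attachment::create_renv_for_prod(path = ".", output = "renv.lock")
```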
And I will definitely echo your statement, Eric, about making sure that you test these things, as in create a Docker image and run a container, before you just throw it at your CI/CD process. We've seen a lot of folks who will do that, and they'll test things, but then they'll make, like, one more final little change to their app before they create the pull request that kicks off the CI/CD process, and they don't retest because they really don't think it's going to actually affect anything else. But, of course, it does, and that CI/CD breaks, and, you know, everybody has to have a meeting and a conversation around how we're gonna fix this. If, before you push things out to that CI/CD process, you promise yourself to test it, which means that you're gonna run a docker run command locally, you will cut down on many of those meetings that I have sat in before to hand off, sort of, the app from local to production to the DevOps team that's going to actually stand it up. So that's my word of wisdom here.
I know, Eric, that you are a big fan of Nix as a potential replacement for renv. I probably shouldn't even say this, but we build a lot of Shiny apps that don't even leverage renv, but use the r2u image from Dirk Eddelbuettel, which just grabs the latest version of all of the R packages that your app depends on. Obviously, the trade-off there is full reproducibility. If you only have a couple dependencies and you're actively maintaining that app on a day-to-day basis, then I think it's okay to make the argument to leverage that approach. It's just sort of a choice on our end. But if full reproducibility is really important to you, then leveraging renv can definitely be the way to go. One of the interesting things that I found is that shiny2docker creates this Dockerfile that typically uses a base Docker image called rocker/geospatial.
So it's the geospatial image from the Rocker project, which comes with R and Shiny, and I have to imagine a lot of geospatial libraries as well. So I believe that you could sort of customize this after the fact, right? Because we're just creating this dockerfiler object that we can then edit. So it might be a good idea to look into that and determine if that image is the right base image for you, especially if you have some constraints around the size of Docker images that you're allowed to have and that your DevOps team is sort of expecting you to hand over the wall to them. I know that this geospatial image from the Rocker project is already over one and a half gigs by itself.
We tend to create a lot of apps that are actually just around one gig. Again, we like to use that r2u image pretty often, which keeps things quite small. So if you do have size limitations like that, it just might be a consideration. But in a lot of circumstances, that image might work perfectly for you as well. I think one of the interesting things that you may not know is that tools like Posit Connect and shinyapps.io are doing this behind the scenes, their own version of this. So when you submit your code to either one of those services, they're going to build a Docker container, and they're going to use some sort of workflow that they have. I don't know if it uses dockerfiler, if it uses shiny2docker, but they're going to try to essentially parse your code, scan it, and take a look at all the dependencies that you have and what's gonna be needed from an R package dependency standpoint, a system package dependency standpoint, a version-of-R standpoint, and, I imagine, an operating system standpoint as well. And they're gonna try to do their best job of putting that Dockerfile together and doing this for you. So I think this is sort of the next step to take this into your own hands. I think it can be a really helpful introduction to Docker for those who are not necessarily familiar with it. And as you continue to use this workflow, maybe you'll start to also get comfortable with taking a look at the Dockerfile that gets generated, seeing if there are optimizations or enhancements that you can potentially make to fit what you're trying to do a little bit better than what's automated behind the scenes. But this is a great gateway project, as you mentioned, and I'm really excited to see it. Yeah. When you think about building it locally, sometimes you're not as concerned about these
[00:22:21] Eric Nantz:
other issues you identified, like the size or the base image you're building off of. Sometimes you just want all the things right away to make your development easier. But, yeah, once you narrow down the hosting platform you're gonna throw this on, I totally agree: the more you can run this locally and iron out any issues before you throw it over there, the better off your life is, whether it's CI/CD or, like I said, deploying to these cloud-based platforms. Because, let's say something happens on the hosting platform and you're convinced that your app works: you can show your IT support that, guess what, it's running in Docker locally, whether on my own machine or what I had to do recently.
Some may know that GitHub has a product called Codespaces where you can, you know, boot up what is in essence a containerized environment for VS Code. I would install my Docker image on there, pull it down, run the app, and make sure it works there. And that's obviously not the biggest, you know, resource footprint it has available. So you've verified it works in these other situations. And if your hosting platform is still not working, then you've got some stuff to tell IT. It's like, I did my homework; it's working on these things. This literally came from personal experience, experimenting with Kubernetes stuff at the moment, and I was driving myself batty because it was working fine in these other instances, but not there. But now we've got some things where you can troubleshoot.
So that's more of an in-the-trenches situation here. I think the big picture is that container technology is great for a lot of reproducibility. It also just gives you a lot of options for where you put this that you don't necessarily have if you just keep the app as is. We didn't even mention there's another platform that you work with a lot, Mike, called ShinyProxy, that also is very heavily based in the container footprint as well. You bring the app as an image there, a Docker image or a container image, if you will. They have different options even in that orchestration engine. So there's a lot out there. You give yourself a lot of possibilities. I've actually talked to the Posit folks about, well, you know, Posit Connect
Cloud is great. If you could give me, like, a way to bring my own container, oh, we'd be golden, but we're not there
[00:24:42] Mike Thomas:
yet. Yep. No, you're exactly right. ShinyProxy is another option where it is a bring-your-own-container option. You know, it's open source; you have to set the environment up yourself and manage it all yourself. But if you prefer to go that route, then that is a great possibility among the many choices that we have for Shiny deployment these days.
[00:25:19] Eric Nantz:
Well, jeez, we just got finished talking about container technology for sharing your Shiny apps. But especially if you've been following this show along with recent developments, you know there's another challenger, as I say, for how we distribute Shiny apps in a very novel way, and that is, of course, with WebAssembly. In particular, the terrific work authored by George Stagg, with webR and now with the shinylive package, where we can take a Shiny app that has, you know, minimal dependencies and throw that into a WebAssembly process. And then our individual browsers become that engine, much like how Docker was the engine to run the Shiny app in our last highlight.
Why am I transitioning to WebAssembly now? Well, because our next highlight is indeed a Shiny app, in the context of exploring what is actually under the hood of a ggplot2 visualization, but it has been deployed as a WebAssembly app. This has been authored by June Choe, who has been a very prolific developer in the visualization space with ggplot2. And, apparently, this idea has been in the works for a while, because he's actually talked about the idea of understanding how the different layers in a ggplot come into play as you're constructing these visualizations.
He's written papers on this. He's actually given talks, such as at the JSM conference as well as rstudio::conf before that, about his take on the layering approach and a package that he has coauthored in the ggplot2 ecosystem called ggtrace, which is another novel contribution here to basically look at the internals of ggplot2. And in fact, one could say that this ggplot2 layer explorer app that we're looking at literally right now as we speak is an intelligent GUI front end to the ggtrace functionality. So right off the bat, like any WebAssembly app, you put the URL in, say, Microsoft Edge or Google Chrome.
Firefox can be hit or miss with WebAssembly, just saying, but at least for this one, it seems to work fine. Once it loads up, you've got yourself a pretty typical-looking scatter plot with a linear regression line of best fit. It's a predefined plot as option one, but you've got multiple options here, such as a bar chart, what looks like a density curve (I'm just clicking through here), a box-and-whisker plot, as well as a violin-type plot split horizontally. In each case, you can look at the plotting code in the little code editor at the top, and you can make tweaks as you see fit. Like, if you have your own favorite plot, you can just throw it in here and regenerate the plot, because WebAssembly, again, has that R console kinda baked in. You can just run this code at your leisure and try stuff out. And once you have your plot, you've got really interesting ways of selecting the different components of these layers on the left side of the app, with this little radio group button here: a choice of, say, the compute position, the geom, the type of data inside of that. And on the right side, you've now got a way to inspect what is actually in that layer, in that particular call to the ggplot2 functionality.
So just like in the case of this violin plot, I can look at what's happening with the geom_violin code here, look at the data behind the scenes, look at both the input and the output as well in a nice little data table. And if I want to run some of the expressions that ggtrace is exposing, I can run those on the fly and refresh the dataset or the summary based on the code I put into yet another editor. This is fascinating. You can even get information on the layer at a higher level, with a modal that pops up showing the different classes, the subclasses, and the methods that compose that layer.
This really is, to use a cliche, looking under the hood. This is as close to really getting into the internals of a car engine, but with ggplot2, as I've ever seen. And the fact is, I can do this all interactively. This is amazing. I can only imagine the amount of work that has gone into building this. But like the good documentation practices that Mike and I espouse many times in our projects, there's a very comprehensive about page in this app where you can look at, at a high level, how to use the app, what the general workflow is, and how you can do some common operations in here, like being able to compare the inputs and the outputs, running custom expressions to kinda massage that data a bit, as well as exploring the graphical-type object output for each of these different geoms that you can, you know, run here. And there's a little button called hijack plot to literally show what it would look like if you draw this onto the existing visualization.
This is amazing. He does mention there is a, you know, very defined scope that he's exploring here with respect to ggproto methods. Again, I don't know nearly as much as what June knows under the hood of ggplot2, but I do feel like I could bring my own plot to this and really go to town with just seeing what the building blocks of putting it together are. A very technical deep dive into the magic that ggplot2 gives all of us. So a very straightforward, streamlined app. I can see myself spending a lot of time with this in my next visualization adventures. But, Mike, what are your impressions of this?
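For a taste of the kind of layer inspection the app surfaces, ggplot2's own `layer_data()` helper returns the computed data a layer receives after its stat runs; {ggtrace} builds richer interactive tooling on top of these same internals. A small sketch using a violin plot similar to one of the app's predefined examples:

```r
library(ggplot2)

# A violin plot, roughly like one of the explorer's predefined options
p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_violin()

# Inspect the data computed for the first (and only) layer: the density
# coordinates that geom_violin ultimately draws
head(layer_data(p, i = 1L))
```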
[00:31:29] Mike Thomas:
Well, one of my impressions this week, I think, is that ggplot2, I believe, turns
[00:31:34] Eric Nantz:
18 years old. Did it have a birthday this week? Yeah, I kept seeing that, or even 21. Like, I don't even know which one's correct now.
[00:31:41] Mike Thomas:
I don't know. I think it was 18, from everything I saw, which would take us back to 2007. I think that sounds about right.
[00:31:49] Eric Nantz:
Did you ever use base plot before ggplot2, or before you realized what ggplot2 was? Oh, yeah. I was a heavy user of base plot. And then once I knew ggplot2, it's like my life changed forever. If you're a believer in base plot, well, one of us on the call is, and I'm sure people are not surprised about that.
[00:32:08] Mike Thomas:
No. No. I did the same thing. When we were starting out with R, I think it was because I had a professor that didn't know that R packages existed, or anything outside of base R, which, from my conversations with others in academia around that time, is not unique, unfortunately. But, yeah, this is really, really interesting. You know, the WebAssembly stuff still blows me away. Every once in a while, we'll be trying to do something sort of obscure from a plotting standpoint with ggplot2, where we have to go into a particular layer or a ggplot object and try to make a modification, right, to get it to show us this customized thing that we want. And I imagine that some of the wizards out there that do things like Tidy Tuesday — or is there a competition specifically around data viz as well, or did there used to be one on the R side? Oh, there used to be one. Yeah. That's a good point. I forgot what that was called. But, yeah, we've definitely seen some pretty incredible ggplot2-based visualizations.
I imagine that they would occasionally do the same thing, and it's always felt to me like a little bit of the wild west. Like, you're going into uncharted territory when you start really diving into a ggplot object and then looking at these particular layer attributes. But this web application that's been developed really makes it much more tangible and obvious for us to be able to take a look at those internals on a layer-by-layer basis. And I think it's a really interesting approach that they took to doing so. It's really helpful. I was playing around with the little R code editor to update the plot that they were showing us. I'm looking at predefined plot three, making some changes to the labels, and seeing how it sort of propagates down through the layers themselves.
And I just find this really interesting. It's a great tool that's been developed for us to be able to do that, and it's a really good education in ggplot2.
[00:34:13] Eric Nantz:
Yep. And while you were talking, I was doing a little sleuthing, and I found the GitHub repository of this app. I put it in the show notes for all of you as well. There are some really handy tricks that he's doing from both the Shiny perspective and the WebAssembly perspective. So this is — you know, I'm always on the lookout for where people can take the direction of WebAssembly, especially in the case of what you might call educational use, really getting to know certain packages or certain analytical methods. But in the case of visualization, oh my goodness, this is just an excellent use case. Because, again, not only does it come with the predefined plots, the five choices, you can run your own and have it updated in real time. The magic of webR knows no boundaries, in my opinion.
So I'm gonna be definitely looking at this in more detail. Again, a great, great example here. I hope George Stagg is aware of this. I'm sure he would be super thrilled to see where WebAssembly is being taken here. But, yeah, credit to June Choe and the rest of his contributors here for a massive accomplishment. Alright. For our last highlight today, we're gonna shift gears a little bit, getting a little more low level with R itself. And those of you who have been around R for a bit, or maybe are new to it, you may find that one of the critiques is, well, with R being an interpreted language, sometimes it's just a tad slower than, obviously, the lower-level solutions based on, say, C++, Rust, whatever have you, especially lately.
But there always have been, you know, attempts to make R code faster, such as the aforementioned conversion to C++ via {Rcpp}. You know, that's always been tried and true. And like I said, Rust is getting a lot more attention lately. But an unexpected repository appeared in the last week and a half, and this is on GitHub at the moment. This is a brand new effort: a new R package has been authored by Tomasz Kalinowski, who I believe is an engineer at Posit. He has authored the {quickr} package. And what does this really mean? Well, on the tin, it just says it helps make your R code run faster. How is this actually accomplishing it?
Well, there is a function it exposes called quick(), and then you feed into this the R code, i.e., a function that you've created that you want to make faster. And apparently, with some of these examples in the README here — again, kind of simpler arithmetic-type situations with, like, double for loops — we are seeing some pretty substantial reductions, I should say, in the time to run a {quickr} version of a function, going down to, you know, only four milliseconds. So I guess if you scale that up, that could be substantial gains there.
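As a rough sketch of the pattern shown in the package README — the exact declaration syntax may differ as this young package evolves, so treat this as illustrative rather than authoritative:

```r
# Hypothetical toy example of {quickr}: wrap a plain R function with quick().
# The declare(type(...)) line states the argument's type and shape up front,
# which is what lets the compiler emit Fortran for the loop.
slow_sum <- function(x) {
  declare(type(x = double(NA)))  # a double vector of any length
  total <- 0
  for (xi in x) {
    total <- total + xi
  }
  total
}

fast_sum <- quickr::quick(slow_sum)  # compile the function down to Fortran
fast_sum(as.double(1:10))            # same answer as slow_sum(), just faster
```

The key constraint, as discussed below, is that the function has to stay within the supported vocabulary and scalar/vector types for the compilation to succeed.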
What is it actually doing under the hood? From what I can tell in the README here, it is utilizing Fortran, folks. Yes, Fortran, a language you may not have heard about unless you were, you know, a real old-timer like me — in grad school we did have some Fortran happening in our statistics curriculum. It is apparently compiling this R function that you supply to the quick() function into Fortran code. Well, there. That's an interesting creation, if I dare say so myself. So I am intrigued. Now, there are some caveats here that Tomasz outlines.
This will not work with every type of function, because it is dependent on the type of return value that you're gonna get out of this. So there are some caveats at play, but I am definitely intrigued, and apparently there is experimental support for using this in an R package. But, yeah, some of these restrictions — I had to scroll to see this — you may have to make sure that your arguments are declared explicitly, their types and apparently their shapes as well. Not every type of input is supported: just integer, double, logical, and complex.
The return value must not be a list, along with some other caveats as well. So as I can see, this is early days, and there is a subset of the current built-in R vocabulary that is supported — he's got a printout of the 66 or so of those under-the-hood base functions that are supported. But he does plan to expand this out in the future. So we'll keep an eye on this. There was a lot of enthusiasm on social media when I saw this shared out, but, yeah, yet another contender to make your R code run faster — with Fortran, of all things. So — hello, again, Eric here. Unfortunately, during our recording, we had a snafu occur with our backend system and the connection for our awesome co-host, Mike. So his take on this last highlight was unfortunately lost, but rest assured he was impressed with {quickr} as well. So, anyway, we apologize.
We won't have his audio, but let's go ahead and close out the show. So back to our regularly scheduled podcast. This is gonna be the coolest party ever. Yeah. This is one of those cases where I'd love to know kind of the genesis of building this, what the big motivation was here. Now, I am doing a little sleuthing on the GitHub repo. I do see a fellow contributor is the esteemed Charlie Gao, who is, of course, the author of {mirai}. So my guess is there might be an async thing in play here. I don't know. But I may wanna talk to Tomasz sometime, if he's around, to get the story behind the code a little bit on this. But I'm intrigued, and knowing what Charlie is cooking up over at Posit, there are a lot of things this could lead to. So a fascinating discovery here — again, you always learn something new in R. Right? This is definitely one of those.
With that, that will wrap up our main segments here. I do have to get out of here because the day job calls again, but we definitely thank you so much for joining us. And once again, the R Weekly project runs on your contributions. If you have a great highlight or a resource you wanna share with us, a pull request is a way at rweekly.org — you can find the link in the top right corner. We have a custom issue draft, or pull request draft, I should say. Fill that out and our curator of the week will get that in the upcoming issue, most likely. So with that — I usually talk about our social media handles, but I do have to skedaddle out of here. You can find those in the show notes. I write those show notes; a lot of work goes into that, so definitely check those out. And also, we have chapter markers too if you wanna, you know, skip to your favorite highlight. Just hit that little skip button if you're on a modern podcast player. It'll be right there for you. So with that, we'll close up shop here for episode 206. We're glad that we were able to record this and talk with you all again.
And hopefully, we'll be back with another edition of R Weekly Highlights next week.
But I am not alone. Thankfully, able to carve out time today after a bit of chaos is my awesome cohost, Mike Thomas. Mike, how are you doing today? Doing pretty well, Eric. Apologies to the listeners. We'll probably sound a little bit different today. I'm
[00:01:04] Mike Thomas:
using, unfortunately, my earbud headphones to record instead of my typical microphone, which has officially bitten the dust. And I think this is one of the byproducts of having not recorded for a couple weeks, as I decided to test out my microphone about five minutes before we jumped on and realized that it is officially out of commission. So I will be hitting up Amazon very shortly
[00:01:26] Eric Nantz:
and should sound better next week. No, we know how it goes. And just like with the infrastructure we've been spinning up, sometimes when it fails, it fails at the most inopportune times, and we have to sort those things out. Well, great for you — it's an Amazon click away. Some other times, there's a lot of blood, sweat, and tears, so to speak, in getting everything set back up. I've been down that road too, especially in my Docker adventures literally right before I recorded here, but I think I got it solved. Anybody that runs Docker on Windows — we're gonna talk about this a little bit — you definitely have my sympathies. Man, WSL is a thing to deal with. Anyway, I digress. I tend to do that from time to time. So let's look at the notes here. It has been a minute, Mike. We always do our prep show notes together.
Let's look at the curator here. Well, of course, it would be me. Whenever I have so many things going on, usually that's the time when I have to curate an issue. But thank goodness, I was able to get one out there, and we got a lot of great selections to talk about. So this was a pretty easy one for me to get out the door, but, as always, I could never do it alone. I had tremendous help from our fellow R Weekly team members and contributors like all of you, with, in this case, I think a whopping five or six pull requests that had some great resources that have been shared in this week's issue. So without further ado, let's get right into it. I already teased it out. Right?
There is a novel technology that's, you know, hit the software development landscape for the last almost fifteen years, it seems like, and that is the world of container technology — the way that you can encapsulate not just a quote-unquote application or a library, but its own system dependencies, in what you might say is a cross between a full-blown virtual machine on your system and a more fit-for-purpose architecture. In this space, the Docker runtime for containers has been what usually gets the most attention. It's had the most mindshare, you might say, but it's not the only game in town, of course. There are some other ones like Podman, which has a lot of compatibility with Docker.
And if you wanna get really low level on Linux, you can use what's called LXC for those kinds of containers. As an R user, you may be wondering, what are some of the best use cases for this? And I still stand by what I'm about to say: in the context of Shiny development, containers are a massive help. It opens up a lot of doors, a lot of possibilities — "I can scale this," to use those clichés — and also, more transparently, for where you actually host it. There are a lot of options available to you. You may be new to Docker and wondering, oh, gosh, do I have to be a Linux expert to get started as an R user?
Not necessarily. You can grow into it, because our highlight is talking about a fantastic package to kind of ease your way into the world of Docker, especially in the context of Shiny development. And it comes to us as the package called {shiny2docker}, with the number two in the middle. And we have a great blog post from the team at ThinkR — in particular, Vincent Guyader, who I believe is a coauthor of this package — to introduce {shiny2docker} and give us the use case of just why it's important and the mechanics of how it works. The team at ThinkR has been using Docker for quite a bit. I've had multiple conversations with the esteemed Colin Fay, also a fellow curator on the R Weekly team, on his container adventures and, of course, WebAssembly adventures on top of that.
But Docker has been a key focus in one of Mike's and my favorite packages, {golem}, to help with those deployment concerns. And I'm gonna say it again, because I can't resist: once you go {golem}, you never go back. At least in my humble opinion. I'm getting a virtual applause from Mike here, so I'm not too off base. So they've always had hooks to use Docker, to help generate a Dockerfile, for probably the past year or so. But now they've encapsulated that functionality into a standalone package, hence {shiny2docker}. So what does it get for you on the tin, so to speak?
You get to have what's called the Dockerfile, which in essence is the set of instructions that you give the Docker runtime to build your — what you might call — image, or application-based image. {shiny2docker} is gonna help automate that process to create it for you. It's also gonna interact nicely with {renv} right out of the box. If you don't have one already, it's gonna generate an renv lock file, detect what versions of packages you are using in that application library, and bootstrap that for you to ensure the reproducibility of the package versions.
I do have additional thoughts on that towards the end; I'll save them for then. But what's also nice is {shiny2docker}'s role in CI/CD, especially with GitHub Actions, GitLab runners, and whatnot. It's going to help you create either a GitLab CI configuration or a GitHub Actions workflow YAML file, which is wonderful to not only build a container on that infrastructure, but also to push it to a registry of sorts for these container images — the most prevalent one being Docker Hub. But you could also choose additional registries as well.
And that is a huge deal, because then, with that image in place, you can choose where you deploy it. So how do you get started? Well, of course, you're gonna install the package with the typical install.packages(), because it is on CRAN. So you can get it straight from CRAN or R-universe or other places like that. And then the rest of the blog post gives us kind of a case study for preparing a relatively simple Shiny application, in a simple app.R — very much the hello-world-type Old Faithful geyser histogram that we all know very well.
And then, once you have that encapsulated, it's time to generate the Dockerfile. It's simply a call to shiny2docker(). Give it the path to where your app.R and the rest of your Shiny files are, and then it kind of takes care of the rest in a very {usethis}-like fashion. It's gonna give you the Dockerfile ready to go, plus a special text file called .dockerignore. If you're familiar with version control with Git, you can have a .gitignore to tell Git, don't touch that file — environment variable files, say hello; don't ask me how I know. You can have the same thing with .dockerignore. There may be some things in your application that were more for development; you don't want those in your Docker image, because you do wanna keep the image size reasonable, since that's gonna, you know, potentially prolong your download times when you actually call this to run.
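Based on the workflow the post describes, the core call is roughly the following — treat the argument name as a sketch and check the package documentation for the exact signature:

```r
# install.packages("shiny2docker")  # it's on CRAN
library(shiny2docker)

# Point it at the folder containing app.R; it writes a Dockerfile,
# a .dockerignore, and (if one is missing) an renv.lock capturing
# the package versions your app depends on.
shiny2docker(path = ".")
```

From there, everything else happens on the Docker side: the generated Dockerfile is just plain text you can inspect and tweak before building.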
And then, throughout the steps, it's got a lot of nice kind of error checking. Like I said, it's bootstrapping an renv lock file out of the box for you, and then in that Dockerfile it's gonna give the command to actually run the application using Shiny's runApp() function, which we typically put, like I said, at the end of these Dockerfiles, these instructions. And then, once you have that in place, you can step back to Docker itself — assuming you have Docker on your system, which again is a prerequisite. I do wanna say, in my humble opinion, working with Docker is much easier on Linux than on the other operating systems, whether Mac or Windows. But I feel especial sympathy for the Windows users, because you do need the Windows Subsystem for Linux — version two of that, in particular — to run Docker on your system.
Trust me when I say that can be finicky depending on your corporate environment. So best of luck with that.
[00:09:49] Mike Thomas:
Mike might have something to say about that. I'd say Casper can help. We've done that a lot. But, yes, you're exactly right. Once you get WSL two installed, I would just highly recommend installing Docker Desktop to automatically install the Docker engine and give you a nice UI into all of your running containers and images as well.
[00:10:08] Eric Nantz:
Right on. It's exactly what we are recommending. Some of you may be aware, as a side tangent, that I am part of the submissions working group under the R Consortium. Literally, as I record, tomorrow I am set to transfer the next version of that pilot with the Docker version of a Shiny app. And, yes, you better believe that in my instructions for the reviewers, we have: install WSL and then install Docker Desktop. We need the easiest way for them to get containers running. So, yes, a very on-point recommendation. So once you have that in place, you're gonna build the image with docker build. You can give it what's called a tag, like a little label for it, and it's gonna inspect that Dockerfile and then let the magic run, so to speak, building it layer by layer. And then, once you have that image in place, you can use it at any point on your system with the docker run command.
Give it a port that you want your Shiny app to be exposed on, give it a name if you wish, then the name of that image afterwards, and you're gonna get a process in the terminal to say, okay, your app is running on localhost at port whatever. Then you can browse to that port. Typically, you can map that to port 80 if you wanna make it super simple, but you can choose any port, really, at that point. And then you've got yourself a Shiny app. It should work just as if you were running this in your R environment, like in RStudio or Positron or whatnot; it should look just like that. That's the whole idea. It's just another way to execute your application, but your application source code is the same.
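Those two steps look roughly like this on the command line — the image name and port here are placeholders, not anything the post prescribes:

```shell
# Build an image from the Dockerfile in the current directory,
# tagging it with a memorable name
docker build -t my-shiny-app .

# Run it, mapping port 3838 inside the container to the same port
# on the host; --rm cleans the container up when it exits
docker run --rm -p 3838:3838 my-shiny-app
# then browse to http://localhost:3838
```

Mapping to `-p 80:3838` instead would make the app reachable at plain `http://localhost`, which is the "port 80" simplification mentioned above.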
So the post concludes with some nice, kind of practical tips to give you the smoothest process to get yourself on this container journey. One — yeah, as you're iterating, I'm sure you're gonna have, like, cruft in your app directory as you're trying stuff. Once you're done — like, you've got a stable thing — try to keep that relatively clean. And, when necessary, add things, directory names or file names, to the .dockerignore if they're really not meant to be used by the container. They do recommend — and I actually heartily agree in this context — that {renv} from the start, just outside of this Dockerfile or {shiny2docker} pipeline, can be helpful. Although I just gave a talk at R/Medicine about how I think Nix honestly can be an even more attractive alternative to this. Mike someday will be converted to that; we're not there yet. But it's proven, for me, in multiple projects now.
Nix can also fit nicely with Docker as well, where you can have Nix serve up your development libraries for the app, and even at the system level, for that matter. And in Docker, instead of using R and {renv} to bootstrap that, you can just bootstrap that same Nix recipe, if you will. You'll get those same versions of packages. Myself and Bruno Rodrigues gave our talks at R/Medicine; hopefully the recordings will be out soon. For Nix and Shiny, there's a pathway there. So, I digress. Other great things to think about: definitely look at things locally before you, you know, rely on CI/CD to help with a lot of this. Trust me when I say weird things can happen in your CI/CD scripts, or the YAML instructions that you give it, or — like me, a glutton for punishment — when you put in some bash shell scripts to automate the creation of files. You may just mistype one path, and it all goes haywire.
Try to test that locally; you'll thank yourself later. Also, you wanna make sure that you're keeping security in mind. You don't wanna put a lot of sensitive credentials in these images, just in case. Hopefully, there are ways around that via environment variables in the platform you choose to deploy this to. Many of them support that, where you don't have to put it in the code, so to speak. And, yes, this is your gateway to Docker, but it does help to have a little bit of background into what's happening in, like, that Dockerfile, that set of instructions.
How do you get system dependencies inside? Where does R fit in the picture? It does take a bit of getting used to. I won't pretend I have everything figured out. But knowing the way the layer system works, you wanna put the things that are most frequent to change towards the end, not towards the beginning. Because once you change something, everything below that step will have to be rebuilt anyway. So, if you think about it, if your app is really iterating, you wanna put references to the app code as far down that Dockerfile as possible — which, again, {shiny2docker} is gonna help you with — and keep the package installation stuff towards the top. That way, that part stays pretty stable; you just iterate on the code. It'll take much less time to rebuild these Docker images than if you had the app stuff at the beginning and then you're, like, doing all the package installation.
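As a hedged illustration of that layer-ordering idea — the base image tag and paths here are made up for the example, not taken from the generated file:

```dockerfile
# Stable layers first: base image and package restore get cached,
# so they only rebuild when the lock file itself changes
FROM rocker/shiny:latest
COPY renv.lock renv.lock
RUN R -e "install.packages('renv'); renv::restore()"

# Frequently-changing app code last: editing the app only
# invalidates the layers from this COPY downward
COPY . /srv/shiny-app
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp('/srv/shiny-app', host = '0.0.0.0', port = 3838)"]
```

If the `COPY . …` line sat above the `renv::restore()` step instead, every source edit would force a full package reinstall on rebuild, which is exactly the paint-drying scenario described next.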
Yes, that can be like watching paint dry sometimes, so you just gotta be prepared for that. In the end, a very nice, you know, gateway as a Shiny developer to get into containers. In my humble opinion, I think this can also work nicely with Nix. Maybe I'll talk to the authors of this in the future about having an extension of sorts — with {rix}, who knows? The ideas are out there. But, Mike, you and I are big fans of containers. What did you think about Vincent's post here? Yeah. We try to containerize
[00:15:57] Mike Thomas:
everything, especially Shiny apps. I really like this post and the introduction to the {shiny2docker} package, which seems to heavily utilize the {dockerfiler} package under the hood — which I have been previously familiar with, because that's a dependency of {golem}. {golem} has some functions, I believe, in the dev folder — the 03_deploy script, for those {golem} developers out there — that allow you to create a Dockerfile for, you know, some of the different sort of end places where you might be landing your Shiny apps. And it's a one-liner that does a lot of this type of thing behind the scenes, and it looks like {shiny2docker} kind of extends this a little bit. There are some packages, I think, that really help in this process.
One of my favorites is the {attachment} package, and I know {shiny2docker} leverages that. There's this function from {attachment} called create_renv_for_prod() that {shiny2docker} uses under the hood, which can actually, I believe, sort of ensure that all necessary R packages for your app are accounted for in the Docker image via your renv.lock file. And one thing that I will mention is that getting {renv} and Docker and everything to play nicely can be a little tricky, especially when it comes to sort of updating your app and updating packages in your app. But once you get the hang of it — and, I think, thanks to some of these other helper functions — it can be a huge lifesaver, especially around dependency management and making sure that what works on your machine works in production.
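If you wanted to run that step yourself, outside of {shiny2docker}, the call looks roughly like this — the argument names are my best reading of the {attachment} docs, so double-check them there:

```r
# Scan the app's scripts for library()/require()/:: calls and write a
# production-oriented renv.lock covering every package the app actually
# uses, which the Dockerfile's renv::restore() step can then consume.
attachment::create_renv_for_prod(
  path   = ".",          # folder containing the Shiny app sources
  output = "renv.lock"   # lock file to generate
)
```

The appeal over a plain renv::snapshot() is that it works from what the code references rather than from whatever happens to be installed in your development library.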
And I will definitely echo your statement, Eric, about making sure that you test these things — that you create a Docker image and run a container — before you just throw it at your CI/CD process. We've seen a lot of folks who will do that and test things, but then they'll make, like, one more final little change to their app before they create the pull request that kicks off the CI/CD process, and they don't retest, because they really don't think it's going to affect anything else. But, of course, it does, and the CI/CD breaks, and, you know, everybody has to have a meeting and a conversation around how we're gonna fix this. If, before you push things out to that CI/CD process, you promise yourself to test it — which means that you're gonna run a docker run command locally — you will cut down on many of those meetings that I have sat in before, handing off the app from local to production and to the DevOps team that's going to actually stand it up. So that's my word of wisdom here.
I know, Eric, that you are a big fan of Nix as a potential replacement for {renv}. I probably shouldn't even say this, but we build a lot of Shiny apps that don't even leverage {renv}, but use the r2u image from Dirk Eddelbuettel, which just grabs the latest version of all of the R packages that your app depends on. Obviously, the trade-off there is full reproducibility. If you only have a couple of dependencies, and if you're actively maintaining that app on a day-to-day basis, then I think it's okay to make the argument to leverage that approach. It's just sort of a choice on our end. But if full reproducibility is really important to you, then leveraging {renv} can definitely be the way to go. One of the interesting things that I found is that {shiny2docker} creates a Dockerfile that typically uses a base Docker image called rocker/geospatial.
So it's the geospatial image from the Rocker Project, which comes with R, Shiny, and, I have to imagine, a lot of geospatial libraries as well. So I believe that you could sort of customize this after the fact, right? Because we're just creating this {dockerfiler} object that we can then edit. So it might be a good idea to look into that and determine if that image is the right base image for you, especially if you have some constraints around the size of Docker images that you're allowed to have and that your DevOps team is sort of expecting you to hand over the wall to them. I know that this geospatial image from the Rocker Project is already over one and a half gigs by itself.
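Since what you get back is a {dockerfiler} object, a hedged sketch of swapping in a leaner base image might look like the following — the method names follow {dockerfiler}'s R6 API as I understand it, and the image tag and paths are placeholders:

```r
library(dockerfiler)

# Build a Dockerfile on a smaller base than rocker/geospatial,
# adding only the system libraries this particular app needs
dock <- Dockerfile$new(FROM = "rocker/shiny:latest")
dock$RUN("apt-get update && apt-get install -y --no-install-recommends libcurl4-openssl-dev")
dock$COPY(".", "/srv/shiny-app")
dock$EXPOSE(3838)
dock$CMD("R -e \"shiny::runApp('/srv/shiny-app', host = '0.0.0.0', port = 3838)\"")

# Write the result out, replacing the generated Dockerfile
dock$write("Dockerfile")
```

The same pattern would let you edit the object {shiny2docker} produces rather than building one from scratch, if image size is the concern raised here.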
We tend to create a lot of apps that are actually just around one gig. Again, we like to use that r2u image pretty often, which keeps things quite small. So if you do have performance limitations like that, it just might be a consideration. But in a lot of circumstances, that default might work perfectly for you as well. I think one of the interesting things that you may not know is that tools like Posit Connect and shinyapps.io are doing this behind the scenes — their own version of this. So when you submit your code to either one of those services, they're going to build a Docker container, and they're going to use some sort of workflow that they have. I don't know if it uses {dockerfiler} or {shiny2docker}, but they're going to try to essentially parse your code, scan it, and take a look at all the dependencies that you have and what's gonna be needed from an R package dependency standpoint, a system package dependency standpoint, a version-of-R standpoint, and, I imagine, an operating system standpoint as well. And they're gonna try to do their best job of putting that Dockerfile together and doing this for you. So I think this is sort of the next step, to take this into your own hands. I think it can be a really helpful introduction to Docker for those who are not necessarily familiar with it. And as you continue to use this workflow, maybe you'll start to also get comfortable with taking a look at the Dockerfile that gets generated, seeing if there are optimizations or enhancements that you can potentially make to fit what you're trying to do a little bit better than what is being automated behind the scenes. But this is a great gateway project, as you mentioned, and I'm really excited to see it. Yeah. When you think about building it locally, sometimes you're not as concerned about these
[00:22:21] Eric Nantz:
other issues you identified, like the size or the base image you're building off of. Sometimes you just want all the things right away to make your development easier. But, yeah, once you narrow down the hosting platform you're gonna throw this on — totally agree — the more you can run this locally and iron out any issues before you throw it over there, the better off your life is, whether it's CI/CD or, like I said, deploying to these cloud-based platforms. Because let's say something happens on the hosting platform and you're convinced that your app works. You can show your IT support that, guess what, it runs on Docker locally, whether on my own machine or what I had to do recently.
Some may know that GitHub has a product called Codespaces, where you can, you know, boot up what is, in essence, a containerized environment for VS Code. I would install my Docker image on there — pull it down, run the app, and make sure it works there. And that's obviously not the biggest resource footprint that it has available. So you've verified it works in these other situations. And if your hosting platform is still not working, then you've got some stuff to tell IT: I did my homework; it's working on these things. This literally came from personal experience, experimenting with Kubernetes stuff at the moment, and I was driving myself batty because it was working fine in these other instances, but not there. But now you've got some things where you can troubleshoot.
So that's more of an in-the-trenches situation here. I think the big picture is that container technology is great for a lot of reproducibility. It also just gives you a lot of options for where you put this that you don't necessarily have if you just keep the app as is. We didn't even mention there's another platform that you work with a lot, Mike, called ShinyProxy, that also is very heavily based on the container footprint as well. You bring the app as an image there — a Docker image or a container image, if you will. They have different options even in that orchestration engine. So there's a lot out there. You give yourself a lot of possibilities. I've actually talked to the Posit folks about — well, you know, Posit Connect
Cloud is great. If you could give me, like, a way I can bring my own container, oh, we're golden, but we're not there
[00:24:42] Mike Thomas:
yet. Yep. No. You're exactly right. ShinyProxy is another option where it is a bring-your-own-container option. You know, it's open source. You have to set the environment up yourself and manage it all yourself. But if you prefer to go that route, then that is a great possibility among the many choices that we have for Shiny deployment these days.
[00:25:19] Eric Nantz:
Well, jeez. We just got finished talking about container technology for sharing your Shiny apps. But especially if you've been following this show along with recent developments, you know there's another challenger, as I say, for how we distribute Shiny apps in a very novel way, and that is, of course, WebAssembly. In particular, the terrific work authored by George Stagg, with webR and now with the shinylive package, where we can take a Shiny app that has, you know, minimal dependencies and throw that into a WebAssembly process. And then our individual browsers become that engine, much like how Docker was the engine to run the Shiny app in our highlight.
Why am I transitioning to WebAssembly now? Well, because our next highlight here is indeed a Shiny app, in the context of exploring what is actually under the hood of a ggplot2 visualization, but it has been deployed as a WebAssembly app. This has been authored by June Choe, who has been a very prolific developer in the visualization space with ggplot2. And, apparently, this idea has been in the works for a while, because he's actually talked about the idea of understanding how the different layers in a ggplot come into play as you're constructing these visualizations.
He's written papers on this. He's actually given talks, such as at the JSM conference as well as rstudio::conf before that, about his take on the layering approach and a package that he has coauthored in the ggplot2 ecosystem called ggtrace, which is another novel contribution here to basically look at the internals of ggplot2. And in fact, one could say that this ggplot2 layer explorer app that we're looking at literally right now as we speak is an intelligent GUI front end to the ggtrace functionality. So right off the bat, like any WebAssembly app, you put the URL in, say, Microsoft Edge or Google Chrome.
Firefox can be hit or miss with WebAssembly, just saying, but at least for this one, it seems to work fine. Once it loads up, you've got yourself a pretty typical-looking scatter plot with a linear regression line of best fit. It's a predefined plot as option one, but you've got multiple options here, such as a bar chart and what looks like a density curve. I'm just clicking through here: a box-and-whisker plot as well as a violin-type plot split horizontally. In each case, you can look at the plotting code in the little code editor at the top, and you can make tweaks as you see fit. Like, if you have your own favorite plot, you can just throw it in here if you like and regenerate the code, because WebAssembly, again, has that R console kind of baked in. You can just run this code at your leisure and try stuff out. And once you have your plot, you've got really interesting ways of selecting the different components of these layers on the left side of the app with this little radio group button choice of, say, the compute position, the geom, the type of data inside of that. And on the right side, you've got now a way to inspect what is actually in that layer in that particular call to the ggplot2 functionality.
So just like in the case of this violin plot, I can look at what's happening with the geom_violin code here, look at the data behind the scenes, look at both the input and the output as well in a nice little data table. And if I want to run some of the expressions that ggtrace is exposing, I can run that on the fly and refresh the dataset or the summary based on the code I put into yet another editor. This is fascinating. You can even get information on the layer at a higher level, with a modal that pops up showing the different classes, the subclasses, and the methods that are composing that layer.
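For a flavor of the input-versus-output layer inspection the app surfaces: the app itself is built on ggtrace, but a minimal sketch of the same idea is possible with ggplot2's own exported layer_data() function, which returns the data a layer computed behind the scenes. The mtcars example below is an assumption standing in for the app's predefined plots.

```r
library(ggplot2)

# A plot like the explorer's first predefined example:
# scatter points plus a fitted linear regression line.
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# layer_data() returns the data ggplot2 computed for one layer --
# roughly the "output" side of what the explorer's data table shows.
smooth_data <- layer_data(p, i = 2)  # layer 2 is geom_smooth()
head(smooth_data[, c("x", "y")])
```

Comparing `p$data` (the raw mapped input) against `layer_data(p, i)` for each layer is a crude, non-interactive version of the input/output toggling the explorer gives you in its GUI.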
This really is, to use a cliché, looking under the hood. This is as close to, like, really getting into the internals of a car engine, but with ggplot2, as I've ever seen. And the fact is I can do this all interactively. This is amazing. I can only imagine the amount of work that has gone into building this. But like the good documentation practices that Mike and I espouse many times for our projects, there's a very comprehensive about page in this app where you look at, at a high level, how to use the app, what's the general workflow, and how you can do some common operations in here, like being able to compare the inputs and the outputs, trying to run custom expressions to kinda massage that data a bit, as well as exploring the graphical type of object output for each of these different geoms that you can, you know, run here. And there's a little button called hijack plot to literally see what it would look like if you draw this onto the existing visualization.
This is amazing. He does mention there is, you know, a fairly defined scope that he's exploring here with respect to ggproto methods. Again, I don't know nearly as much as what June knows under the hood of ggplot2, but I do feel like I could bring my own plot to this and really go to town with just seeing what are the building blocks of putting this together. A very technical deep dive into the magic that ggplot2 gives all of us. So, a very straightforward, streamlined app. I can see myself spending a lot of time with this in my next visualization adventures. But, Mike, what are your impressions of this?
[00:31:29] Mike Thomas:
Well, one of my impressions this week, I think, is that ggplot2, I believe, turns
[00:31:34] Eric Nantz:
18 years old. Did it have a birthday this week? Yeah. I kept seeing that, or even 21. Like, I don't even know which one's correct now.
[00:31:41] Mike Thomas:
I don't know. I think it was 18 from everything I saw, which would take us back to 2007. I think that sounds about right.
[00:31:49] Eric Nantz:
Did you ever use base plot before ggplot2, or before you realized what ggplot2 was? Oh, yeah. I was a heavy user of base plot. And then once I knew ggplot2, it's like my life changed forever. There are still base plot believers out there. Well, one of us on the call, I'm sure people are not surprised about.
[00:32:08] Mike Thomas:
No. No. I did the same thing. We were starting out with R. I think it was because I had a professor that didn't know that R packages existed, or anything outside of base R, which, from my conversations with others in early academia around that time, is not unique, unfortunately. But, yeah, this is really, really interesting. You know, the WebAssembly stuff still blows me away. Every once in a while, we'll be trying to do something sort of obscure from a plotting standpoint with ggplot2 where we have to go into a particular layer of a ggplot object and try to make a modification, right, to get it to show us this customized thing that we want. And I imagine that some of the wizards out there that do things like Tidy Tuesday, or is there a competition specifically around data viz as well, or did there used to be one on the R side? Oh, there used to be one. Yeah. That's a good point. I forgot what that was called. But, yeah, we've definitely seen some pretty incredible, you know, ggplot-based visualizations.
I imagine that they would occasionally do the same thing, and it's always felt to me like a little bit of the Wild West. Like, you're going into uncharted territory when you start really diving into a ggplot object and then looking at these particular layer attributes. But this web application that's been developed really makes it much more tangible and obvious for us to be able to take a look at those internals on a layer-by-layer basis. And I think it's a really interesting approach that he took to doing so. It's really helpful. I was playing around with the little R code editor to update the plot that they were showing us. I'm looking at predefined plot three, making some changes to the labels and seeing how it sort of propagates down.
And I just find this really interesting. It's a great tool that's been developed for us to be able to do that, and it's a really good education in ggplot2.
[00:34:13] Eric Nantz:
Yep. And, while you were talking, I was doing a little sleuthing, and I have found the GitHub repository of this app. I put it in the show notes for all of you as well. There are some really handy tricks that he's doing from both the Shiny perspective and the WebAssembly perspective. So, you know, I'm always on the lookout for where people can take the direction of WebAssembly, especially in what you might call the educational case, you know, really getting to know certain packages or certain analytical, you know, methods. But in the case of visualization, oh my goodness, this is just an excellent use case. Because, again, not only does it come with the predefined plots, the five choices, you can run your own and have it updated in real time. The magic of webR knows no boundaries, in my opinion.
So I'm definitely gonna be looking at this in more detail. Again, a great, great example here. I hope George Stagg is aware of this. I'm sure he would be super thrilled to see where WebAssembly is being taken here. But, yeah, credit to June Choe and the rest of his contributors here for a massive accomplishment. Alright. For our last highlight today, we're gonna shift gears a little bit, getting a little more low level with R itself. And those of you who have been around R for a bit, or maybe are new to it, you may find that one of the critiques is, well, with R being an interpreted language, sometimes it's just a tad slower than, obviously, the lower-level solutions based on, say, C++, Rust, whatever have you, especially lately.
But there have always been, you know, attempts to make R code faster, such as the aforementioned conversion to C++ via Rcpp. You know, that's always been tried and true. And like I said, Rust is getting a lot more attention lately. But an unexpected repository appeared in the last week and a half, and this is on GitHub at the moment. This is a brand new effort: a new R package has been authored by Tomasz Kalinowski, who I believe is an engineer at Posit. He has authored the quickr package. And what does this really mean? Well, on the tin, it just says it helps make your R code run faster. How is this actually accomplishing it?
Well, it exposes a function called quick(), and you feed into this the R code, i.e., a function that you've created that you want to make faster. And apparently, with some of these examples in the README here, again, kind of simpler arithmetic-type situations with, like, double for loops, we are seeing some pretty substantial reductions in the time to run a quickr version of a function, going down to, you know, only four milliseconds. So I guess if you scale that up, that could be substantial gains there.
What is it actually doing under the hood? From what I can tell in the README here, it is utilizing Fortran, folks. Yes, Fortran, a language you may not have heard about unless you were, you know, a real old-timer like me in grad school, where we did have some Fortran happening in our statistics curriculum. It is apparently compiling the R function that you supply to quick() into Fortran-type code. Well, there. That's an interesting creation, if I dare say so myself. So I am intrigued. Now, there are some caveats here that Tomasz outlines.
This will not work with every type of function, because it is dependent on the type of return value that you're gonna use out of this. So there are some caveats at play, but I am definitely intrigued, and apparently there is experimental support for using this in an R package. But, yeah, some of these restrictions, I had to scroll to see this: you may have to make sure that your arguments are declared explicitly, their types and apparently their shapes as well. Not every type of input is supported; it's just integer, double, logical, and complex.
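As a rough sketch of that workflow: the quick() function and the declare(type(...)) annotation below follow the quickr README as described in the episode, but treat the exact syntax as an assumption. The fallback scaffolding is purely so this sketch runs even on a machine without quickr or a Fortran toolchain installed.

```r
# No-op stand-in for quickr's declare() so the plain R version still
# runs when the package is absent (the arguments are lazy, so the
# type(...) call is never actually evaluated here).
if (!exists("declare")) declare <- function(...) invisible(NULL)

# A deliberately simple loop, the kind of arithmetic the README
# benchmarks. The argument's type and shape are declared explicitly,
# as the restrictions above require (double vector of any length).
slow_sum <- function(x) {
  declare(type(x = double(NA)))
  total <- 0
  for (xi in x) {
    total <- total + xi
  }
  total
}

# Compile via quickr::quick() when available; otherwise fall back to
# the interpreted version so the sketch stays runnable.
fast_sum <- if (requireNamespace("quickr", quietly = TRUE)) {
  tryCatch(quickr::quick(slow_sum), error = function(e) slow_sum)
} else {
  slow_sum
}

fast_sum(c(1, 2, 3))
```

Either way the result is the same; the point of the compiled version is only speed, which is why the README benchmarks loops like this one.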
The return value must not be a list, along with some other caveats as well. So, as I can see, this is early days, and there is a subset of the current built-in R vocabulary that is supported; he's got a printout of the 66 or so base R functions that are supported under the hood. But he does plan to expand this out in the future. So we'll keep an eye on this. There was a lot of enthusiasm on social media when I saw this shared out, but, yeah, yet another contender to make your R code run faster, with Fortran of all things. So... Hello again, Eric here. Unfortunately, during our recording, we had a snafu occur with our back-end system and our connection for our awesome co-host, Mike. So his take on this last highlight was unfortunately lost, but, rest assured, he was impressed with quickr as well. So, anyway, we apologize.
We won't have his audio, but let's go ahead and close out the show. So, back to our regularly scheduled podcast. This is gonna be the coolest party ever. Yeah. This is one of those cases where I'd love to know kind of the genesis of building this, what the big motivation was here. Now, I am doing a little sleuthing on the GitHub repo. I do see that a fellow contributor is the esteemed Charlie Gao, who is, of course, the author of mirai. So my guess is there might be an async thing in play here. I don't know. But I may wanna talk to Tomasz sometime, to see if he's around, to get the story behind the code a little bit on this. But I'm intrigued, and knowing what Charlie is cooking up over at Posit, there are a lot of things this could lead to. So, a fascinating discovery here, which, again, you always learn something new in R Weekly, right? This is definitely one of those.
With that, that will wrap up our main segments here. I do have to get out of here because the day job calls again, but we definitely thank you so much for joining us. And once again, the R Weekly project runs on your contributions. If you have a great highlight or a resource you wanna share with us, you're a pull request away at rweekly.org; you can find the link in the top right corner. We have a custom pull request draft, I should say. Fill that out, and our curator of the week will get that into the upcoming issue, most likely. So with that, I usually talk about our social media handles, but I do have to scoot out of here. You can find those in the show notes. I write those show notes; a lot of work goes into them, so definitely check those out. And also, we have chapter markers too if you wanna, you know, skip to your favorite highlight. Just hit that little skip button if you're on a modern podcast player. It'll be right there for you. So with that, we'll close up shop here for episode 206. We're glad that we were able to record this and talk with you all again.
And hopefully, we'll be back with another edition of R Weekly Highlights next week.
Whoops!
Episode Wrapup