A collection of tips for spreading the good word about your awesome R package, how spring cleaning a package codebase doesn't have to be a dreadful experience thanks to usethis, and the culmination of a learning journey to bootstrap Node.js projects powered by webR.
Episode Links
- This week's curator: Colin Fay - [@_ColinFay](https://twitter.com/_ColinFay) (Twitter)
- Marketing Ideas For Your Package
- Spring clean your R packages
- webrcli & spidyr: A starter pack for building NodeJS projects with webR inside
- Entire issue available at rweekly.org/2024-W11
Supplement Resources
- rOpenSci software review process: Aims and scope https://devguide.ropensci.org/softwarereview_policies.html#aims-and-scope
- Colin Fay's hexmake Shiny app https://github.com/ColinFay/hexmake
- No installation required: How WebAssembly is changing scientific computing https://www.nature.com/articles/d41586-024-00725-1
- tryr - Client/Server Error Handling for HTTP APIs https://github.com/analythium/tryr
Supporting the show
- Use the contact page at https://rweekly.fireside.fm/contact to send us your feedback
- R-Weekly Highlights on the Podcastindex.org - You can send a boost into the show directly in the Podcast Index. First, top-up with Alby, and then head over to the R-Weekly Highlights podcast entry on the index.
- A new way to think about value: https://value4value.info
- Get in touch with us on social media
- Eric Nantz: @theRcast (Twitter) and @[email protected] (Mastodon)
- Mike Thomas: @mike_ketchbrook (Twitter) and @[email protected] (Mastodon)
Music credits powered by OCRemix
- Vivid Orbis - Marble Madness - Gaspode - https://ocremix.org/remix/OCR04555
- Black Genesis (Floating Continent) - Final Fantasy VI Balance & Ruin - Brandon Strader, Rexy - https://ocremix.org/remix/OCR02796
[00:00:03] Eric Nantz:
Hello, friends. We are back at episode 156 of the R Weekly Highlights podcast. If you're new to the show, this is the weekly podcast where we talk about the latest highlights that have been featured on this week's R Weekly issue. My name is Eric Nantz, and I'm delighted you joined us from wherever you are around the world. And, you know, I never do this alone. I have my awesome cohost joining right here, my linemate, partner in crime here, Mike Thomas. Mike, how are you doing today?
[00:00:30] Mike Thomas:
I'm doing well, Eric. A little better than the Red Wings, though. It seems like they've been skidding in the last 3. Come on, Red Wings. Let's pick it up here.
[00:00:39] Eric Nantz:
There's been a bit of anger, and there's speculation that I think you're familiar with this, and those that follow sports in general in the US are familiar with this. There's more advertising now on players' jerseys. They literally just put a patch for, of all things, a trash company on the Red Wings jersey. Oh, that's bad. They are winless since then. Now I'm not one of those people who's gonna say this is exactly a correlated event, but I'm just saying, couldn't they have waited till next year? I'm just saying. So, yeah, we'll see, Mike.
[00:01:14] Mike Thomas:
Yeah. Well, it's certainly correlated, but we'll hope it's not causal.
[00:01:19] Eric Nantz:
Exactly. Yes. Thank you for cleaning that up. Ironically, talking about trash coming and cleaning it up. Well, luckily, you know it's not trash here, what we're talking about today. We don't have to worry about losing streaks here. We're on a hot streak of awesome highlights this year for sure. And our curator this week is the esteemed Colin Fay, of course, the architect of all things golem and many of our Shiny and web technology tools in general, which we'll be talking about later in this episode. But as always, he had tremendous help from our R Weekly team members and contributors like all of you around the world with your awesome pull requests, suggestions, and general feedback.
So let's get right to this. Right? And one of the kind of rites of passage, you might say, as you develop your R skills and your journey into data science: do you have that great idea for maybe that new analysis technique, maybe that new data source that you want to make as easy as possible for yourself and potentially others to bring into R and do some cool analysis with? Well, that, of course, is writing a package. Right? This used to sound so intimidating, but with the frameworks that we've been featuring heavily on R Weekly Highlights over the life of this show and the life of R Weekly in general, there are lots of amazing tools in place that get you started right on that journey to create your package.
And let's say you've used those tools. You've got an awesome package ready to go. But you might ask yourself, now what? How do we exactly get this in the hands of our users? And our first highlight is coming from, once again, the very awesome rOpenSci blog, authored by Yanina Bellini Saibene and Maëlle Salmon, returning once again with their series that is inspired by their recent workshops with rOpenSci Champions about how you can promote and release your R package to the world. And so like I said, Mike, first step is just getting it out there at all. What kind of advice do you have for us here?
[00:03:20] Mike Thomas:
Yeah. Well, I mean, I'll even set the tone before that. Creating an R package to me is just such a great idea to try at some point, because it sort of forces you to use a lot of great best practices around writing good software around the science that underlies what you're trying to do. So I would highly recommend taking a stab at creating a package if you have never tried to do so before. I think you'll find it a pretty rewarding experience, and I think it'll make you a better programmer. But once you have created that package, you're exactly right, Eric. How can we market it? And one of the first steps, which I wholeheartedly agree with, that Yanina and Maëlle recommend is to create a great README.
I can't stress this enough. You know, your README will help others understand exactly what it is your package does, how to install it, and maybe a few different examples of how to get started using it. It may even, as you create that README, help you sort of refine your idea around what you actually wanted this package to do, and may cause some changes to your actual functionality. Not speaking from experience or anything like that, but it's one of those exercises, kinda like a rubber duck, I think, that forces you to explain exactly, you know, in layman's terms, non-code terms, what your package is trying to accomplish. And it's a great idea.
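If you want R to scaffold that README for you, the usethis and devtools helpers below are a minimal sketch, run from inside the package project:

```r
# Scaffold a README.Rmd with the usual badges/installation/example skeleton
usethis::use_readme_rmd()

# After editing, render README.Rmd into the README.md that GitHub displays
devtools::build_readme()
```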
Spend all the time creating, you know, the coolest hex logo that you possibly can as well. Not that I've wasted hours doing that at the end of a project, but that's a super fun part of it as well, and I think that, from a marketing perspective, any sort of visual fun aids can help market your package too. If you use GitHub, Yanina and Maëlle recommend that you pin that package repository to your profile so that, as soon as somebody visits your GitHub, that'll be sort of the first thing that they see that stands out, and that's a great idea as well. The next recommendation that they have around publishing is one that I need to take to heart, because I have not done this yet, and it is to create a universe on R-universe, which, as we've talked about on the podcast before, is this absolutely incredible resource created by Jeroen. And it is a phenomenal place to host your packages that automatically, I think, displays the documentation and metadata around your package in just a beautiful, really accessible way.
And also, I think, allows others to install it very quickly, because I think in some cases it'll build binaries, if I have that correct. Yes. It is building binaries. Yep. Okay. Which can make the installation experience a lot better for your users. And then, you know, after that, sort of, I guess the holy grail, right, would be potentially publishing it to CRAN, which is sort of the final way to make installing your package probably the easiest to the largest array of users across different experience levels out there.
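As a concrete picture of that smoother install experience, this is the usual shape of an R-universe install call; `user` and `mypkg` are hypothetical placeholders:

```r
# Install from a personal R-universe, with CRAN as the fallback repo
# for dependencies ("user" and "mypkg" are made-up names)
install.packages(
  "mypkg",
  repos = c("https://user.r-universe.dev", "https://cloud.r-project.org")
)
```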
And one thing that I also hadn't thought of, if you're familiar with the rOpenSci project, is that they have a peer review process, which is a phenomenal thing that is in place to sort of ensure robustness and rigor around your R package and ensure, again, that you're using some of the best practices around software development and creating an R package. It's not something that I've done before. I think that it's a fantastic resource, and it's something that I wanna take advantage of. To me, I was never sure that my packages were sciency enough for rOpenSci to necessarily consider, you know, peer reviewing the work that I've done, and I think maybe this blog post is debunking that myth a little bit. I know that they have office hours and things like that that are publicly available, so I would definitely recommend that folks take advantage of these resources that they offer. And in that peer review process, I think, as they mentioned, they may catch a lot of things that you might get flagged on when submitting to CRAN. So they may help you expedite that process of getting your package onto CRAN.
So those are their recommendations around publishing. And then maybe, Eric, you can talk a little bit about their recommendations around promoting your package.
[00:07:46] Eric Nantz:
Yes. And this is a skill set that is admittedly sometimes not intuitive to many of us, especially as we're new to this situation of getting the package out there but trying to get it into the hands of the user base that we intended to have, especially as we wanna garner initial feedback and, frankly, make a positive difference. Right? And so there are lots of interesting ways. And, again, maybe not one size fits all for everybody, but I think the advice here in general is quite sound. One of those is taking advantage of the lowest-friction ways to get this out there on various either social media or other publishing platforms.
One of them, of course, I'm not gonna be ashamed to say I'm biased with this, is, hey, send it to us at R Weekly. Right? We have a section every single week on new packages that have been released to the R ecosystem, whether they're on CRAN or on GitHub only or a GitHub-like repository. And we also link to the R-universe project in every issue as well. So, again, you are, as I say at the end of the show all the time, a pull request away from making that announcement on R Weekly. We definitely recommend you take advantage of that.
And, also, if you wanna spread the word on kind of your intention of the package, and maybe some more up-to-date notes from yourself to your audience, another great way is to start your own blog. Right? There are plenty of frameworks now in the R community that will make creating a blog with Markdown super easy. I, of course, speak with great success with the blogdown package. Now Quarto, of course, which we talk about routinely on the highlights, has its own mechanism for creating a website with a blog component. So those would be another terrific way that you could spread the word about your package, and then posting that on various social media channels such as Mastodon, LinkedIn, some of the others out there.
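Scaffolding a blog like that takes only a couple of calls. A minimal blogdown sketch (Quarto offers an equivalent blog project type); the post title is just an example:

```r
# One-time setup: creates a Hugo site with a default theme in the
# current directory
blogdown::new_site()

# Draft a post, then live-preview the site locally while you write
blogdown::new_post(title = "Introducing mypkg")
blogdown::serve_site()
```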
And those, again, are great ways to get the word out. Again, not natural for yours truly to do all this, but over time, you really start to see some really great nuggets come out in the community as you follow these feeds, and you could definitely put your package as one of those items in those feeds. And then, as you said, Mike, going through the rOpenSci process, from the creation and peer review of your package, is a terrific way to enhance the quality. I personally have not done it, but I've lived vicariously through my esteemed friend Will Landau and his peer review process for targets, which is, you know, widely recognized in the R community for innovation, to say the very least. And, yeah, rOpenSci has done a tremendous job with targets.
And one of the other bits of advice in this post is that if it is on rOpenSci, they have additional channels to market your package, such as featured tech notes that go on their rOpenSci blog, which is what we're reading through right here as we speak. And also, as you mentioned, they have community calls and can even set up dedicated working sessions, where maybe you, as a package author, wanna give a chance for, you know, a prospective user to hop on a video working session with you. And you can talk about the package and maybe debug some issues. It's another great way to get the word out, because those are a very, you know, relaxed atmosphere, just, you know, practical discussions, and another excellent way for your users to learn about the way your package works if you're in the rOpenSci ecosystem. And, of course, social media and blog posts are just one way to get the word out. Another terrific way, especially in the R community, is the worldwide presence of these user groups and also the R-Ladies groups.
Another terrific way is to maybe have a short presentation or a short working session, maybe an online workshop, at one of these user groups. Another terrific way to get the word out about your package. I've seen some really great showcases of that throughout the years on these various online forums now, especially since the pandemic. Many of these are remote. They're sharing recordings on YouTube or other video channels. Another terrific way to get the word out. And I dare say another fun way, if you're really adventurous, is to do a little livestream once in a while, like I used to do in the past, which I hope to get back to someday. But, again, there are many different avenues for you to get your package out there. And certainly, if it is a very scientifically focused package, there are some well-renowned journals out there, such as getting it into the R Journal itself or the Journal of Statistical Software, and many others that are domain specific as well. Which, of course, in my field, we do a lot of literature review. We're seeing a lot of new algorithms being published, and they often have an accompanying R package to go with that publication.
Another very traditional, yet very powerful, mechanism for getting the word out there. And one little bit, Mike: as you said, you weren't really sure if a package you're creating is, quote unquote, scientific enough for rOpenSci's, you know, scope. But what we'll link to in the show notes is a section of their online peer review book where they do talk about the intended scope that they look for with respect to bringing a package on board to rOpenSci. And the great news is that it doesn't have to be a very focused, domain-specific scientific algorithm or method. They have many packages that are involved in making data more accessible, making APIs more accessible. There are lots of interesting domains here that, you know, could be a good fit. Again, it may not be for everybody, but, again, if it does fit in that scope, you might benefit greatly by coming on board rOpenSci.
And certainly, I'll also speak on the perspective of those in an industry where maybe you don't get a chance to publicize this to the worldwide R community until you get the, quote unquote, blessing of getting it open sourced. Maybe you have to deal with this internally at your organization. If you have a large organization, how do you make sure that your user base within the company gets their eyes on it? I'll go back to what I said maybe a few minutes ago. Having an internal blog is a cool thing to do too. I'm actually trying this out now as I speak, doing a little Quarto blog for our internal group to share package announcements that our group is creating, and having those broadcast on either some newsletter or some other, you know, distribution service within the company. So it's not just us making these cool packages out there and then, you know, others in statistics or data sciences not getting word of it. We're gonna find ways to get that message across. So I think a lot of these principles can apply to those of you that, I'm gonna steal a phrase from my friend Michael Dominick of the Coder Radio podcast,
you dark matter developers out there. I see you. I know you're out there. There are great ways you can take some of these techniques internally at your respective organizations, too. So really, really great blog post. Gives you a lot of great ideas to follow up on. And again, I think just getting the package out there, you've done immense work to do that. It's a journey. I know how it goes. But getting the word out there so that your user base can get their hands on this is, you know, a really critical component to making sure that you can make the package even better as your users get their hands on it. So, yeah, really great advice here, and I highly recommend what we're talking about here.
[00:15:41] Mike Thomas:
Yeah, I know. And that's a great point, Eric, too, because not all of us can share our packages with, you know, the general public in the outside world. But I think you absolutely can take these principles and leverage them within the communities inside your own organization, through whatever means, you know, necessary that you have available to do that. But still leveraging these principles, I think, is a great idea.
[00:16:10] Eric Nantz:
Now what if, Mike, you're in the situation where you built that package, but maybe it was, I don't know, 5, 6, maybe even 10 years ago? You look at the code base and you realize, oh, past me did that. If past me knew what present me knows now, I probably wouldn't have done it that way. You might be in the situation where your package may deserve a bit of what we'll call spring cleaning, as they say. And so our next highlight comes to us from the Jumping Rivers blog. We've featured them quite heavily on highlights in the past. It's authored by Rhian Davies and is appropriately entitled Spring Clean Your R Packages. And they start off by relating that Jumping Rivers themselves have put many packages on, say, GitHub or CRAN and whatnot, and some of them were developed more than, say, 5 years ago.
And then, as we learn, I mean, I'm a continuous learner, as they say, there are lots of new, you know, best practices, maybe modifications to existing best practices. And then you realize, yeah, you know what? I should try some of that in my legacy package that needs a refresh. How do we go about that efficiently? There are some practical things you can start with, one of which applies if your package is on a GitHub or GitHub-like repository. About 3 or 4 years ago, Mike, there was a movement to change the nomenclature of default branches.
We won't get into all the details here, but the connotation of master didn't exactly sit well with many people in today's, you know, communities. So there's been a movement to change that to a more friendly term such as main or something like that. And so there is an easy way to rename your branch right away. And we're going to talk about this heavily in this segment: the usethis package authored by Posit is the superstar here, so to speak, for getting these tips acted on efficiently, with as little friction as possible. So there is a handy function called git_default_branch_rename().
It's a long function name, but it literally does what it says on the tin. So once you do that, your branch becomes main. You can push that up to GitHub, and you are all set. But, of course, it doesn't end there. We might have some additional things that we wanna tidy up with respect to the package metadata. This was new to me, Mike. I'm curious if you knew about this before, but there were times I wrote my description file for a package, frankly, by hand back in the old days. Yep. You remember. And so I'm updating a package that, again, I last touched, like, 8, 9 years ago.
And there is this little gem that's in this blog post. Now, if you wanna tidy that up, so that things like your fields are in more alphabetical order, maybe the spacing is correct, making sure the names in the maintainer and author fields are consistent and correct, there is a use_tidy_description() function that will basically put everything in the standard order, alphabetize the dependencies, and make sure everything just looks really tidy. So if you happen to have done all that manually, that is awesome. I love seeing that. And then another part is, we don't always want to do things ourselves all the time, right?
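A minimal sketch of the two usethis helpers just mentioned, run from inside the package project:

```r
library(usethis)

# Rename the default Git branch (e.g. master -> main) and update the
# GitHub remote to match
git_default_branch_rename()

# Rewrite DESCRIPTION into the standard field order and alphabetize
# the dependency lists
use_tidy_description()
```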
We're in 2024. There is technology in certain platforms such as GitHub to automate many of these checks as actions. And, Mike, there is even more usethis magic for getting that set up for you, isn't there?
[00:20:00] Mike Thomas:
Yes. There is. And as we talk about things that we used to do that we no longer do, right? Travis CI, I think, used to be the most popular tool for continuous integration, which is, you know, running code, essentially, on maybe some separate server. And a lot of times this was around running tests, ensuring that all your unit tests pass when you create a pull request, before that pull request actually gets merged into the main branch. Nowadays, we're using continuous integration quite a bit as well for, like, creating pkgdown sites, and updating, you know, what's shown in that pkgdown branch of your repository that spins up the whole pkgdown site that folks can go to and see the beautiful version of your package's documentation. So Travis CI sort of used to be the only game in town, but it's GitHub Actions now.
You know, I think that's probably the primary way that folks are going these days. And, fortunately, again, the usethis package allows you to easily create that continuous integration GitHub Action with a function. Stop me if you're not expecting this, but the function name is use_github_action(). And you can supply sort of what type of check you want to create, and that essentially creates a whole entire YAML file that will execute the unit tests in your package, run those tests when a PR takes place, I think, for the most part. And you can alter that YAML file if you want to, you know, change when those checks get fired, or other certain specifications of those unit tests getting run in this continuous integration situation.
So there's a lot of different options here within, you know, being able to use GitHub Actions, depending on sort of how strictly you want to run the tests on your package. And then you may also want to take a look and see how much test coverage, quote unquote, your package has, which is sort of the percentage, I believe, of the lines of R code in your package that actually get exercised by your unit tests, and you wanna ideally be as close to 100%, I think, as possible. I think there's a lot of opinions out there on that that we don't necessarily need to get into. But it's probably a good idea to show your users, sort of in general, at a high level, that you are writing a lot of tests around the functionality for your package, to ensure that, you know, your logic is doing what you expect it to do, and that it continues to satisfy those expectations as you make changes and refactor over time.
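A sketch of that CI and coverage setup; the workflow names here are the standard examples that usethis pulls from r-lib/actions:

```r
# Add a GitHub Actions workflow running R CMD check on pushes and PRs
usethis::use_github_action("check-standard")

# Add a workflow that computes and reports test coverage in CI
usethis::use_github_action("test-coverage")

# Or measure coverage locally with covr
covr::package_coverage()
```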
And the last two things that they recommend are, one, creating a hex sticker, which is a callback to our first blog post as well, which I will highly recommend. I use a site called Canva, which I think I pay a couple bucks a month for, but it is pretty incredible, just the stuff that you can do. And nowadays it seems like everybody's leveraging, you know, these generative AI models to sort of write a prompt of what you want shown on your hex logo, and then, you know, that'll spit out a wild image for you that you can crop to a hex background. In my case, I do that pretty easily with Canva. And I've spent way too much time on that recently, but it's super fun, and it can be a fun way for folks to, it can be the first thing that they see when they navigate to your pkgdown site or, you know, browse the vignettes within your package in the RStudio IDE.
And I think it can sort of create some excitement and engagement around your package. And then the last thing that they recommend is a contributing guide and code of conduct, adding those to your repository and your package as well, to let users know the best way, and sort of the guidelines and principles that you expect people that want to contribute to your package to follow, so they can contribute in a friendly way, in a safe environment that works sort of for everybody that's working on that repository. So some excellent spring cleaning, if you will, ideas and examples from Jumping Rivers. Spring is here, and I think it's time for all of us to start diving into these.
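Those last two files are one usethis call each. A minimal sketch; the contact address is a hypothetical placeholder:

```r
# Add a CODE_OF_CONDUCT.md (Contributor Covenant) with a contact address
usethis::use_code_of_conduct(contact = "[email protected]")

# Add a CONTRIBUTING.md template you can then tailor to your project
usethis::use_tidy_contributing()
```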
[00:24:27] Eric Nantz:
Yeah. I've literally been living this life for, like, 3 weeks now with this legacy package and seen so many areas that need a little attention, a little cleanup here and there. And so all these principles either have or will take action quite a bit. And back to hex stickers. Yes, yours truly did revise a hex sticker for this legacy package. I'm gonna give a quick plug. In the blog post here, they're referencing the hexSticker package, easy for me to say, as an R way of doing it. And then, also, if you wanna bring Shiny into it, many years ago, Colin Fay, as part of a Shiny contest submission, released the hexmake Shiny app, where you can literally create a hex sticker, superimpose an image on top, all within a Shiny app, and download it. So I actually used that, literally, to make a new hex sticker for my internal package.
That was a lot of fun. So I'm never shy to plug that fun Shiny app from my bookmarks as well. And, honestly, yeah, back to the contributing guidelines: when I would build these legacy packages, you know, back then, maybe I was naive. It always seemed like it would be just me, so I didn't put a lot of thought into it. But you know what? These packages, again, I think can really thrive when you have somebody, at least, that wants to be active with you, if not maybe developing day to day, then at least, you know, helping you test things out. Maybe they're a liaison to other users, and then they have feedback, and then they can, you know, find the best way to help you with that feedback. But you wanna give them the easiest way to get started with that. So these contributing guides, whether your package is open source or within the confines of your industry firewall, I think those are critically important to make sure that these others that maybe are willing to step into this will know where to start. Things like a contributing guide, also making good use of issue labeling in whatever your system is for issue ticketing, things like good first issue or help wanted, or, you know, things like that. And, obviously, it's project specific, of course, but make it as easy as possible for people to really drill down to see which areas they can contribute to the most. So, again, great things to think about as you're already in the midst of making your package a little more tidy along the way. Yeah. Really good advice here.
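For the hexSticker route mentioned above, a sticker is a single function call. A minimal sketch; the source image and package name are hypothetical:

```r
library(hexSticker)

# Center an existing image on a hex and write it where pkgdown
# looks for a package logo
sticker(
  "logo-draft.png",                    # hypothetical source image
  package = "mypkg",                   # package name printed on the hex
  p_size = 20,                         # size of the package name text
  s_x = 1, s_y = 0.8, s_width = 0.6,   # subplot position and width
  filename = "man/figures/logo.png"
)
```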
Well, I teased this earlier, Mike, but our curator here has been hard at work, not just curating this issue. Colin has been knee deep in this learning journey, this saga of supercharging his workflows with WebAssembly. And, in fact, what we're gonna be talking about in this last segment is, I believe, the sixth post in his series exploring webR and WebAssembly with respect to, you know, interactive web applications. And what we're going to talk about here, what seems to be kind of a culmination of everything he's been learning, is the idea of having new tools available within, say, the native JavaScript world for bringing webR, R powered by WebAssembly, into these applications, via two new utilities that work in tandem. So let's dive right into this. Earlier in his explorations, he'd been prototyping some interesting use cases of, say, converting an existing Shiny app to webR, preloading packages in an Express.js API, you know, bringing your own functions into webR and then building them into the Node.js app, and whatnot.
Well, he realized that a lot of that was, you know, kind of piecemeal, learning a bit here and there, ad hoc. What if we wanna take those best practices that he's outlined and make them happen in a very easy way, something that might have some parallels to what we get in the R community when we build packages with, say, devtools and usethis, and, in the Shiny situation, of course, what Colin's authored with the golem package? What's a way to bring that all together in this native kind of WebAssembly and JavaScript world? So, in this blog post, he announces two new utilities to make this happen. One of which is called webrcli, which, again, is going to be very similar to kind of a devtools/usethis paradigm, along with other functionality we'll get to later, that's going to help you create a Node.js project, but with the bindings to webR already baked inside, things that he was building manually in the earlier stages of this journey.
This package, or this utility, is going to bootstrap that for you, not too unlike what you would do with usethis and, say, create_package(), I forget the exact name of it, but it's the one that gives you the scaffolding of an R package right away. And then it's up to you to fill in the blanks, if you will. This is doing a very similar framework with, again, the WebAssembly piece of all this. And that works in tandem with the other utility, which is called spidyr, which looks like a way to build extended functionality on top of webR itself, such as what we get in typical R installations when we want a package from CRAN, or maybe even from GitHub with the remotes package.
We have functions to literally install that package. Right? install.packages() or remotes::install_github() or whatnot. spidyr is giving you a utility, a native JavaScript function, that will look very similar to those installation commands, and it's giving you those built on top of webR to bring those packages down to your local project. This is kind of amazing to me. It's not just taking care of the installation. It is putting them in a project-specific directory that, if you're familiar with renv, will look very similar. I went through the GitHub example that we'll have linked in the show notes.
He ignores this directory in his .gitignore, but what I did is I cloned this locally to give it a try. And sure enough, there is a directory, it's called webrpackages, and when you go in there, it will look very similar to your renv library where you download packages. This is fascinating to me. Colin has figured out how to load these packages from a file store into these WebAssembly-powered applications. This is massive to me, because, where I'm going with this ongoing pilot submission with WebAssembly, we want to explore ways of not just grabbing packages from the webR binary repository on the fly, so to speak.
But should we want to distribute packages as part of a bundle, how do we bring those into the application locally? So I will be looking into this quite closely to see if I can take some nuggets from this, whether it's for this particular pilot, or for future explorations, and see if I can mirror this with things like Shinylive that we're using right now in our pilot submission. So the wheels are turning after I read Colin's post here. But this is, again, all fascinating to me. So one thing you'll notice that we didn't mention here is that we're not talking about Shiny. Right? He is speaking on behalf of those that maybe are familiar with JavaScript-native ways of building a web application, but you have a function in R, or a package in R, that you wanna leverage as the back end to that Node.js or other JavaScript-like app. This set of utilities is your way to make that happen.
And I definitely invite you, if you do have, you know, Node.js and npm installed on your machine, give this a shot. I literally ran through the blog post this morning, and everything worked to a tee. Everything worked as advertised. So this, I think, is opening a lot of possibilities here. But, as we often say in these explorations, it's early days. He has not tested this beyond a few examples, and he is very eager to get community feedback on how this goes, for those also that are willing to explore this kind of blazing trail, if you will, of this new journey here.
So there are notes at the end about how he kinda pulled this off from, like, a back end perspective. But, again, he's looking for feedback on this, and I definitely am intrigued by what I'm seeing here. And I can't wait to learn more about how this works under the hood. But this is a great time to talk about WebAssembly right now, because I'm thrilled to say, as of yesterday when we recorded this episode, there is a fascinating new article released by Nature, authored by Jeffrey Perkel. It's entitled No installation required: How WebAssembly is changing scientific computing. And I'm humbled to say that yours truly has a small little quote in here based on our current explorations. But I will say this is the kind of stuff I am super excited about. We are trying to push the envelope here.
We think there is massive potential in many industries for this. Of course, I'm coming from life sciences, but there are many, many others that I think can make heavy use of WebAssembly. This article has a terrific narrative around kind of the genesis of this from George Stagg himself, as he started prototyping webR, along with other members of the scientific community and how they're showcasing the use of this technology. So a great time for me to see this. And, again, super excited to dive into what Colin's exploring here and see how we can supercharge this in the future.
[00:34:47] Mike Thomas:
Yes. Me as well. And, Eric, I'm glad that you shouted out that article, because if you didn't, I was going to. You know, when I first started doing this podcast, I was starstruck that I got to record with, you know, the host of the R-Podcast and the Shiny Developer Series. Then I think I've gotten a little comfortable with you, but now you are featured in Nature, and I'm right back where I started. So hats off to you. That is an awesome accolade, and it's super exciting as well to see that, you know, the scientific community is talking about this stuff too. And it's not just us software nerds, you know, that are the only ones caring about this; it's really something that other folks seem to be seeing as well as a pretty revolutionary thing that's starting to come into the ecosystem. And, fortunately, we do have folks like Colin who are at the cutting edge of this WebAssembly stuff. You know, Colin curated this week, and when I saw the blog post, I thought this was a little bit of insider trading.
[00:35:55] Eric Nantz:
But I'm very glad, I'm very glad that this one made the highlights. It's a great example.
[00:35:57] Mike Thomas:
You know, one of the toy examples here is called this webR SpongeBob example that he has, which I think just sort of allows you to, you know, essentially change some text, a string that you write, to what's called sponge case. A quick story: during a particularly slow period a couple years ago for me, I highly considered creating this exact R package. I didn't end up doing it, and I'm glad I didn't, because it looks like maybe Colin was the one who created the spongebob package. I'm not sure if it was him or somebody else, but somebody took care of it for all of us. Obviously, that's a very important package in the R community. So it's nice to have that one out there. But this is a phenomenal example. And like you said, you know, it's incredible. I haven't actually tried it myself, but it sounds like you maybe forked the repository as well and ran through this and found that there were no issues. Obviously, I think Colin, in both the READMEs in these repositories and this blog post as well, makes a lot of disclaimers that this is very early on, very experimental.
You know, expect a lot of bugs. He, I think, may already be seeing some bugs and edge cases that he's hoping to solve. But, regardless, I think the fundamental concepts here of what's being done are really driving sort of this idea in this space forward, about, you know, this WebAssembly topic, and not needing to manage dependencies in ways that traditionally were a little difficult, and sort of making that much easier and much more accessible to a wider variety of people, which is incredible. So I'm excited to see how this continues, and patiently waiting on blog post 7.
[00:37:45] Eric Nantz:
Yeah. Me as well. And Colin's in this realm of, I'm gonna say, you know, key, amazing thought leaders in this space that are being very adventurous in what's happening here, in the same category as I would consider Bob Rudis and his explorations with webR tying into things like Observable Framework and whatnot. webR is this engine that is powering so many things. Yes, I've been coming to it mostly from the Shiny perspective with Shinylive, but it is so much more than just that. We even featured, what was it, 2 or 3 weeks ago on this very podcast, a blog post that had, you know, an R console basically embedded into the post itself, to try out the code that was being showcased there. Right? The education side, the web application side. And now, as that Nature article is showing, even, you know, high-throughput, HPC-like computation in the browser. It's all coming together. It is. I mean, I don't know. I haven't been this geeked out in years about ways that we can tie our entire data science workflow to a novel technology. And I know I can't stop talking about it, but at the same time, this is the start of something. I still remember sitting at the posit::conf presentation by Joe Chang at the end there, and all of us looking at each other across the room like, yep, we're going with this. We're going to try stuff out and see what happens. Challenge accepted, Joe, if you're listening to this. So, yeah, really cool stuff to see what Colin's exploring here. And it does show that I still have a lot to learn, but at the same time, I'm gonna enjoy learning about this.
[00:39:28] Mike Thomas:
Likewise. Likewise. And, you know, I take it from a Shiny perspective as well. And then, with sort of DuckDB, you have to think about the data side of it, right? And maybe you have an external connection to a database, which makes things easier, maybe not. And, you know, the fact that there are now these integrations between DuckDB and WebAssembly, that I think are going to solve sort of that final piece for us in a lot of ways, in terms of connecting the data to the application, or whatever you're showing on screen, in an easy way. It's incredible.
[00:40:00] Eric Nantz:
Yeah. Absolutely is. And we're gonna be hearing a lot more about this throughout the year. I'll also give a plug once again that we're thrilled to have George Stagg give a keynote at the upcoming Shiny conference coming up in April. So if you're not registered for that, I highly recommend coming to that event as well. And, yeah, my cohost here is gonna have a Shiny app on there as well, so we're really excited for that. Yes. That's exciting. Coming up quick. It sure is. It sure is. But, you know, it's always a quick yet very educational read whenever you see R Weekly every single week. We don't try to bog you down too much. We give you, you know, the awesome resources, blog posts, tutorials, and, as we mentioned at the top, new packages hitting the ecosystem, updated packages, and much, much more. So we're gonna take a couple of minutes to share some additional highlights here. And going back to the Shiny train for a little bit, I had a thought-provoking insight here that was led by this blog post from Jakub Sobolewski over at Appsilon, entitled Using Tests to Develop Shiny Modules.
Now this is something that usually you don't really think about until you get to the stage where your module is almost done, and you're thinking about, okay, how do I make sure that it's robust enough? But Jakub here does a great outline about how the concepts of test-driven development really come into play. Whereas, if you're really iterating on a specific module, there are ways to test it efficiently without having to run the entire app every single time, making clever use of testthat functionality and custom functions and whatnot. And this looks like something I'm gonna start looking at as I start revamping some of my major Shiny apps or building new ones, in terms of making that development cycle of developing modules just a wee bit faster, to get things done quicker, as they say. So, yeah, a really thought-provoking post from Jakub here.
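To give a flavor of that workflow, here is a minimal sketch of exercising a module server with shiny::testServer, no running app required; the counter module is a hypothetical stand-in, not taken from the blog post:

```r
library(shiny)
library(testthat)

# A tiny module server: each "add" click increments a counter
counterServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    count <- reactiveVal(0)
    observeEvent(input$add, count(count() + 1))
    output$n <- renderText(count())
  })
}

test_that("counter increments without launching the full app", {
  testServer(counterServer, {
    session$setInputs(add = 1)  # simulate two button presses
    session$setInputs(add = 2)
    expect_equal(count(), 2)
  })
})
```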
[00:42:01] Mike Thomas:
Yes, I found a blog post, an additional highlight here, from Alexandros Kouretsis from Appsilon, entitled R Plumber: How to Craft Error Responses That Speak Fluent HTTP. You know, we talk about Shiny a lot, Eric. I've said that a few times today. But, you know, one thing that we talk about a lot is creating the best user experience possible around our Shiny apps. And a lot of times that includes error handling, in a graceful way, for the user to understand sort of what went wrong, instead of just getting disconnected from the server. Right? That's what we try to avoid.
And the same principles, I believe, apply to APIs, you know, like Plumber APIs. Right? Where, if something does go wrong, there's going to be an error code that gets sent back to the other application that's making the request, and typically that error code is going to be either a 400-type error or a 500-type error. And 500-type errors typically mean that something went wrong, I think, on the server side. Whereas 400-type errors are typically, you know, something bad happened in terms of the inputs that went into that request; they didn't satisfy sort of what the API was expecting, as opposed to, you know, the server being down or something like that. So understanding sort of the difference between those, and being able to return something more informative back to the applications, means maybe they can create some sort of a UX based upon what type of error code comes back, so the user can understand exactly what went wrong. You know, should they fix this particular field that they just filled out incorrectly before clicking a button that sent that API request, you know, or give them some information about how to potentially rectify the problem? Or do they need to contact IT because the server itself is down? Right? Understanding that difference is really important. So this is a great blog post that I think walks through a discussion around that, and how to make things safer there. And then I will also shout out a project by Péter Sólymos and his team at Analythium, and the package is called tryr, t r y r, that tries to do the exact same thing. I think it's client/server error handling for HTTP APIs, and he has a lot of examples with Plumber there, and the same exact idea, you know, that you're trying to provide sort of a more informative error code response back to the application that sent that request initially. So some great resources here to shout out. And I have been knee deep in Plumber lately and really enjoying it. So this is very timely for me.
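As a generic illustration of that 400-versus-500 distinction in Plumber (a minimal sketch, not the approach from either post; the endpoint and messages are made up):

```r
library(plumber)

#* Divide two numbers, returning an informative 400 on bad input
#* @param x The numerator
#* @param y The denominator
#* @get /divide
function(x, y, res) {
  x <- suppressWarnings(as.numeric(x))
  y <- suppressWarnings(as.numeric(y))
  if (anyNA(c(x, y))) {
    res$status <- 400  # client error: the request itself was malformed
    return(list(error = "x and y must both be numeric"))
  }
  if (y == 0) {
    res$status <- 400  # still the caller's fault, so not a 500
    return(list(error = "y must be non-zero"))
  }
  list(result = x / y)  # serialized to JSON with a 200 by default
}
```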
[00:44:43] Eric Nantz:
Yeah. Really, really awesome insights there. And I'm also on the train of either helping build new APIs or consuming existing APIs and having R layers on top of that. So, yeah, having any way to give that UX, you know, a much more pleasant experience, for not only me as a developer, but my end user, who is not gonna give two wits about what's actually behind the scenes. They just wanna know what happened and how to fix it. So anything like this to translate those cryptic 403s or 502s or whatever you wanna call them. They're all cryptic in the end to most statisticians and data scientists. So being able to translate that, and having a robust kind of paradigm for error handling, is very welcome in this space. But I think it speaks to this new trend that we're seeing, that we're interfacing with other systems of some sort. It's traditionally been HPC systems for me, and now I'm really augmenting that with these web services that may or may not be high performing, but at the same time, they're doing one thing, they're doing it well, and they want to be agnostic to what front end we have on it. So, of course, I'm biased to R. Why wouldn't I be? So having R package interfaces with that, and making that UX seamless, that's a win for me.
You know what else helps you win? Unlike what's happening to my poor Red Wings, reading R Weekly every single week will help you win the game of leveling up your data science knowledge. I tried. I tried. I'm trying to give them good luck for tonight. But, anyway, yeah, every single week we have a new issue online, and it's released basically every Monday morning. And then, you know, the train keeps going, and we are powered by the community. Right? As I mentioned earlier, you know, every single week we look at your awesome pull requests. And you may wonder, how do I get that on there? It's all linked at rweekly.org. We have a link to the upcoming issue draft right at the top right corner.
We're just a pull request away from that new blog post, maybe that new package that you just created following the advice we just mentioned in the first highlight. R Weekly is a great way to showcase that. It's all Markdown, all the time. You know, I've lived the Markdown lifestyle with my package documentation, my internal blog posts, some of this external stuff I'm doing. You know, without Markdown, if I had to do, like, LaTeX for all this, I would cry. I would just cry, Mike. Thank goodness for Markdown. Yes. You are exactly right. Thank goodness for the Shiny includeMarkdown() function as well. Shout out. Very, very nice. Yes. I've used that heavily and with no regrets at all. Yes. So, yep, all Markdown, all the time, at R Weekly. And, also, we'd love to hear from you directly as well. There are many ways to do that. We have a contact page linked in the episode show notes, right at the bottom of our show notes, that you can click to.
We also, if you're listening on a modern podcast app, Podverse, Fountain, Castamatic, CurioCaster, there's a whole boatload out there, you can send us a fun little boost along the way, right in your podcast app itself. All the details are linked in the show notes as well. And, lastly, we are sporadically on various social media outlets. I'm mostly on Mastodon with @[email protected]. Also on the Weapon X thing from time to time with @theRcast, as well as LinkedIn. You can find me on there with show announcements and, you know, blog posts and the like. And, Mike, where can the listeners find you?
[00:48:09] Mike Thomas:
Sure. LinkedIn is probably the best place to see what I'm up to. You can just search Ketchbrook Analytics, k e t c h b r o o k. And if you wanna find me on Mastodon, you can find me at [email protected].
[00:48:20] Eric Nantz:
Yep. I think we've put a nice little bow on this episode. But, again, it's been a great recording session once again, Mike, and we hope to see you all for our next edition of R Weekly Highlights next week.
Hello, friends. We are back at episode 156 of the R Weekly Highlights podcast. If you're new to the show, this is the weekly podcast where we talk about the latest highlights that have been featured on this week's our weekly issue. My name is Eric Nantz, and I'm delighted you joined us from wherever you are around the world. And, you know, I never do this alone. I have my awesome cohost join right here, my line mate, partner in crime here, Mike Thomas. Mike, how are you doing today?
[00:00:30] Mike Thomas:
I'm doing well, Eric. A little better than the the Red Wings, though. It seems like they've been skidding in the last 3. Come on, Red Wings. Let's let's pick it up here.
[00:00:39] Eric Nantz:
There's been a a bit of anger, and there's speculation that I think you're familiar with this and those that follow sports in general in the US are familiar with this. There's more advertising now on players' jerseys. They literally just put a patch for, of all things, a trash company on the Red Wings jersey. Oh, that's bad. They are winless since then. Now I'm not one of those people who's gonna say this is exactly a correlated event, but I'm just saying, couldn't they have waited till next year? I'm just saying. So
[00:01:14] Mike Thomas:
Yeah. We'll see, Mike. Yeah. Well, it's certainly correlated, but we'll hope it's not causal.
[00:01:19] Eric Nantz:
Exactly. Yes. Thank you for cleaning that up. Ironically, talking about trash coming and cleaning it up. Well, luckily, you know it's not trash here. What we're talking about here today, we don't have to worry about losing streaks here. We're on a a hot streak of awesome highlights this year for sure. And our curator this week is the esteemed Colin Faye, of course, the architect of all things GOLM and many of our shiny and web in general technology tools, which we'll be talking about later in this episode. But as always, he had tremendous help from our ROK team members and contributors like all of you around the world with your awesome pull requests, suggestions and general feedback.
So let's get right to this. Right? And one of the kind of rites of passage, you might say, as you develop your R skills and your journey into data science, do you have that great idea for maybe that new analysis technique, maybe that new data source that you want to make as easy as possible for yourself and potentially others to bring into R and do some cool analysis with? Well, that, of course, is running a package. Right? This used to sound so intimidating, but with the frameworks that we've been featuring heavily on our weekly highlights since the life of this show and the life of our weekly in general, there are lots of amazing tools in place that get you started right on that journey to create your package.
And let's say you've used those tools. You've got an awesome package ready to go. But you might ask yourself, now what? How do we exactly get this in the hands of our users? And our first highlight is coming from, once again, the very awesome rOpenSci blog authored by Ioannini Balenis Salbin and Ma'al Salmon returning once again with their series that is inspired by their recent workshops with our open side champions about how you can promote and release your r package to the world. And so like I said, Mike, first step is just getting it out there at all. What kind of advice do you have for us here?
[00:03:20] Mike Thomas:
Yeah. Well, I I mean, I'll even set the tone before that. Creating an r package to me is just such a great idea to try at some point because it sort of forces you to use a lot of great best practices around writing good software around the the science that underlies what you're trying to do. So I would highly recommend taking a stab at creating a package if you have never tried to do so before, I think you'll find it a pretty rewarding experience and I think it'll make you a better better programmer. But once you have created that package, you're exactly right, Eric. How can we how can we market it? And and one of the first steps, which I wholeheartedly agree with that Yanina and Mael recommend is to create a great read me.
I can't stress this enough, you know, your read me will help others, understand exactly what it is your package does, how to install it, and maybe a few different examples of how to get started using it. It may even, as you create that Readme, help you sort of refine your idea around what you actually wanted this package to do and may cause some changes to your actual functionality not speaking from experience or anything like that but it's it's one of those, exercises kinda like a rubber duck I think that forces you to explain exactly, you know, in in layman's terms, non code terms, what your package is trying to accomplish. And and it's a a great idea.
Spend all the time creating, you know, the coolest hex logo that you possibly can as well. Not that I've wasted hours doing that at the end of a project, but that's a super fun part of it as well and I think that from a marketing perspective any sort of visual fun aids can help market your your package as well if you use GitHub you Nina and Mel recommend that you pin that package repository to your profile so that as soon as somebody visits your GitHub that'll be sort of the first thing that they see that stands out and that's a great idea as well. The next, the next recommendation that they have around publishing is one that I need to take to heart because I have not done this yet and it is create a universe on our universe, which as we've talked about on the podcast before, is this absolutely incredible resource created by your own. And, it is a phenomenal place to to host your packages that automatically, I think, displays the documentation and metadata around your package in just a beautiful really accessible way.
And also, I think allows others to install it it very quickly because I think in in some cases it'll build binaries if I have that correct. Yes. It is building binaries. Yep. Okay. Which can make the installation experience a lot better for your users and then you know after that sort of I guess the the holy grail right would be potentially public publishing it to crayon which is sort of the final way to make installing your package probably the easiest to the largest array of users, across different experience levels out there.
And one thing that that I also hadn't thought of it as well, and if you're familiar with the rOpenSci project, is that they have a peer review process which is a phenomenal thing that is in place to sort of ensure robustness and rigor around your R package and ensure, again, that you're using, some of the best practices around software development and creating an R package. It's not something that I've done before. I I think that it's a fantastic fantastic resource and it's something that I wanna take advantage of, to me, and I think maybe this blog is debunking that that myth a little bit. I was never sure that my packages were sciency enough, for our OpenSci to necessarily, consider, you know, peer reviewing the work that I've done. But, as I know that they have office hours and and things like that that are publicly available, I would definitely recommend that folks take advantage of these resources that they offer. And in that peer review process, I think, as they mentioned, they may catch a lot of things, that you might get flagged on when submitting to CRAN. So they may help you expedite that process of getting your package onto CRAN.
So those are their recommendations around publishing. And then maybe, Eric, you can talk a little bit about their recommendations around promoting your package.
[00:07:46] Eric Nantz:
Yes. And this is a a skill set that is admittedly sometimes not intuitive to many of us, especially as we're new to this this, situation of getting the package out there but trying to get it into the hands of the user base that we intended to have, especially as we wanna garner initial feedback and, frankly, make make a positive difference. Right? And so there are lots of interesting ways. And, again, maybe not a one size fits all for everybody, but I think the advice here in general is quite sound. One of those is taking advantage of the lowest friction to get this out there on various either social media or other publishing platforms.
One of them, of course, and I'm not gonna be ashamed to say I'm biased with this, is: hey, send it to us at R Weekly. Right? We have a section every single week on new packages that have been released to the R ecosystem, whether they're on CRAN, on GitHub only, or on a GitHub-like repository. And we also link to the R-universe project in every issue as well. So, as I say at the end of the show all the time, we are a pull request away from making that announcement on R Weekly. We definitely recommend you take advantage of that.
And, also, if you wanna spread the word on the intention of the package, and maybe share some more up-to-date notes from yourself to your audience, another great way is to start your own blog. Right? There have been plenty of frameworks in the R community that make creating a blog with markdown super easy. I, of course, can speak to great success with the blogdown package. And Quarto, which we talk about routinely on the highlights, has its own mechanism for creating a website with a blog component.
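If you haven't stood one up before, blogdown can scaffold a working site in a single call. A minimal sketch, assuming you're starting from a fresh, empty project directory:

```r
# Scaffold a new Hugo-backed blog in the current (empty) project;
# blogdown should offer to install Hugo for you if it isn't found
blogdown::new_site()

# Preview it locally with live reload while you write
blogdown::serve_site()
```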
Those would be another terrific way to spread the word about your package, along with posting on various social media channels such as Mastodon, LinkedIn, and some of the others out there. And those, again, are great ways to get the word out. Again, not natural for yours truly to do all this, but over time, you really start to see some great nuggets come out in the community as you follow these feeds, and you could definitely put your package as one of those items in those feeds. And then, as you said, Mike, going through the rOpenSci process from the creation and peer review of your package is a terrific way to enhance its quality. I personally have not done it, but I've lived vicariously through my esteemed friend Will Landau and his peer review process for targets, which is, you know, widely recognized in the R community for innovation, to say the very least. And, yeah, rOpenSci has done a tremendous job with targets.
And one of the other bits of advice in this post is that if your package is on rOpenSci, they have additional channels to market it, such as featured tech notes on the rOpenSci blog, which is where the post we're reading through right now lives. And, as you mentioned, they also have community calls and can even set up dedicated working sessions, where maybe you, as a package author, wanna give a prospective user a chance to hop on a video working session with you. You can talk about the package and maybe debug some issues. It's another great way to get the word out, because those have a very relaxed atmosphere, just practical discussions, and they're another excellent way for your users to learn the way your package works if you're in the rOpenSci ecosystem. And, of course, social media and blog posts are just one way to get the word out. Another terrific one, especially in the R community, is the worldwide presence of user groups and also the R-Ladies groups.
Giving a short presentation or a short working session, maybe an online workshop, at one of these groups is another terrific way to get the word out about your package. I've seen some really great showcases of that throughout the years on these various online forums, especially since the pandemic: many of these are remote, and they're sharing recordings on YouTube or other video channels. And I dare say another fun way, if you're really adventurous, is to do a little livestream once in a while, like I used to do in the past, and which I hope to get back to someday. But, again, there are many different avenues for you to get your package out there. And certainly, if it is a very scientifically focused package, there are some well-renowned journals out there, such as the R Journal itself or the Journal of Statistical Software, and many others that are domain specific as well. Of course, in my field, we do a lot of literature review; we're seeing a lot of new algorithms being published, and they often have an accompanying R package to go with that publication.
That's another very traditional yet very powerful mechanism for getting the word out there. And one little bit, Mike: you said you weren't really sure if a package you're creating is, quote unquote, scientific enough for rOpenSci's scope. What we'll link to in the show notes is a section of their online peer review book where they talk about the intended scope they look for when bringing a package on board to rOpenSci. And the great news is that it doesn't have to be a very focused, domain-specific scientific algorithm or method. They have many packages that are involved in making data more accessible, making APIs more accessible. There are lots of interesting domains here that could be a good fit. Again, it may not be for everybody, but if your package does fit in that scope, you might benefit greatly by going on rOpenSci.
And certainly, I'll also speak from the perspective of those in an industry where maybe you don't get a chance to publicize this to the worldwide R community until you get the, quote unquote, blessing of getting it open sourced. Maybe you have to deal with this internally at your organization. If you have a large organization, how do you make sure that your user base within the company gets their eyes on it? I'll go back to what I said maybe a few minutes ago: having an internal blog is a cool thing to do too. I'm actually trying this out now as I speak, doing a little Quarto blog for our internal group to share announcements of the packages our group is creating, and broadcasting those on either some newsletter or some other distribution service within the company. So it's not just us making these cool packages and then others in statistics or data science not getting word of it; we're gonna find ways to get that message across. So I think a lot of these principles can apply to those of you that, and I'm gonna steal a phrase from my friend Michael Dominick of the Coder Radio podcast...
You dark matter developers out there, I see you. I know you're out there. There are great ways you can take some of these techniques internally at your respective organizations, too. So a really, really great blog post that gives you a lot of great ideas to follow up on. And again, just getting the package out there means you've done immense work; it's a journey, I know how it goes. But getting the word out so that your user base can get their hands on it is a really critical component to making the package even better as your users use it. So, yeah, really great advice here, and I highly recommend reading this one. Yeah. I know. And that's a great point, Eric, too, because not all of us can share
[00:15:41] Mike Thomas:
our packages with, you know, the general public in the outside world. But I think you absolutely can take these principles and leverage them within the communities inside your own organization, through whatever means you have available to do that. Leveraging these principles internally, I think, is a great idea.
[00:16:10] Eric Nantz:
Now what if, Mike, you're in the situation where you built that package, but maybe it was, I don't know, 5, 6, maybe even 10 years ago? You look at the code base and you realize: oh, past me did that. If past me knew what present me knows now, I probably wouldn't have done it that way. You might be in the situation where your package deserves a bit of what we'll call spring cleaning, as they say. And so our next highlight comes to us from the Jumping Rivers blog, which we've featured quite heavily on the highlights in the past, authored by Rhian Davies and appropriately entitled "Spring clean your R packages". They start off by relating that Jumping Rivers themselves have put many packages on, say, GitHub or CRAN and whatnot, some of them developed more than, say, 5 years ago.
And then, as we learn (I mean, I'm a continuous learner, as they say), there are lots of new best practices, and maybe modifications to existing best practices, and you realize: yeah, you know what, I should try some of that in my legacy package that needs a refresh. How do we go about that efficiently? There are some practical things you can start with, one of which applies if your package is on GitHub or a GitHub-like repository. About 3 or 4 years ago, Mike, there was a movement to change the nomenclature of default branches.
We won't get into all the details here, but the connotation of "master" didn't exactly sit well with many people in today's communities, so there's been a movement to change it to a friendlier term such as "main" or something like that. And there is an easy way to rename your branch right away. We're going to talk about this package heavily in this segment: usethis, authored by Posit, is the superstar here, so to speak, for getting these tips acted on efficiently with as little friction as possible. There is a handy function called git_default_branch_rename().
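For reference, the whole rename is a single call, run from within the package's Git repository:

```r
# Rename the default branch (typically "master" to "main"),
# updating the local repo and what it can reach on GitHub
usethis::git_default_branch_rename()
```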
It's a long function name, but it literally does what it says on the tin. So once you run that, your branch becomes main, you can push that up to GitHub, and you are all set. But, of course, it doesn't end there. We might have some additional things that we wanna tidy up with respect to the package metadata. This was new to me, Mike, and I'm curious if you knew about this before, but there were times I wrote my DESCRIPTION file for a package, frankly, by hand back in the old days. Yep. You remember. And so here I am updating a package that, again, I last touched, like, 8 or 9 years ago.
And there is this little gem in this blog post: if you wanna tidy that up, so that fields are in a standard order, the spacing is correct, and the maintainer and author fields use consistent names, there is a use_tidy_description() function that will basically put everything in the standard order, alphabetize the dependencies, and make sure everything just looks really tidy. So if you happen to have done all that manually, that is awesome. I love seeing that.
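And it really is a one-liner, run from the package root:

```r
# Reformat DESCRIPTION into the standard field order and
# alphabetize the dependency lists
usethis::use_tidy_description()
```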
But we don't always want to do things ourselves all the time, right? We're in 2024. There is newer technology on certain platforms such as GitHub to automate many of these checks in actions. And, Mike, there is even more usethis magic for getting that set up for you, isn't there?
[00:20:00] Mike Thomas:
Yes. There is. And speaking of things that we used to do that we no longer do: Travis CI, I think, used to be the most popular tool for continuous integration, which is, you know, running code essentially on some separate server. A lot of times this was about running tests, ensuring that all your unit tests pass when you create a pull request, before that pull request actually gets merged into the main branch. Nowadays, we're using continuous integration quite a bit as well for things like building pkgdown sites, updating what's shown in the branch of your repository that serves up the whole pkgdown site, so folks can go and see the beautiful version of your package's documentation. So Travis CI sort of used to be the only game in town, but it's GitHub Actions now.
You know, I think that's probably the primary way folks are going these days. And fortunately, again, the usethis package allows you to easily create that continuous integration GitHub Action with a function. Stop me if you're not expecting this, but the function name is use_github_action(). You can supply the type of check you want to create, and that essentially creates a whole YAML file that will execute the unit tests in your package, running those tests when a PR takes place, I think, for the most part. And you can alter that YAML file if you want to change when those checks get fired, or other specifications of how those unit tests get run in this continuous integration setup.
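For reference, a sketch of what that looks like; the workflow names here come from the r-lib/actions examples:

```r
# Add a GitHub Actions workflow that runs R CMD check
# (including your unit tests) on pushes and pull requests
usethis::use_github_action("check-standard")

# Optionally, add a workflow to build and deploy the pkgdown site
usethis::use_github_action("pkgdown")
```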
So there are a lot of different options within GitHub Actions, depending on how strictly you want to run the tests on your package. And then you may also want to take a look and see how much test coverage, quote unquote, your package has, which is sort of the percentage, I believe, of the lines of R code in your functions that actually get exercised by your unit tests, and ideally you wanna be as close to 100% as possible. I think there are a lot of opinions out there on that that we don't necessarily need to get into. But it's probably a good idea to show your users, in general, at a high level, that you are writing a lot of tests around the functionality of your package, to ensure that your logic is doing what you expect it to do and that it continues to satisfy those expectations as you make changes and refactor over time.
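And checking that coverage number locally is quick with the covr package; a minimal sketch, run from the package directory:

```r
# Compute test coverage for the package in the current directory
cov <- covr::package_coverage()
cov

# Open an interactive, file-by-file coverage report
covr::report(cov)
```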
And the last two things that they recommend are, one, creating a hex sticker, which is a callback to our first blog post as well, and which I will highly recommend. I use a site called Canva, which I think I pay a couple bucks a month for, but it's pretty incredible the stuff that you can do. And nowadays, it seems like everybody's leveraging these generative AI models to write a prompt of what you want shown on your hex logo, and that'll spit out a wild image for you that you can crop to a hex background; in my case, I do that pretty easily with Canva. I've spent way too much time on that recently, but it's super fun, and it can be the first thing folks see when they navigate to your pkgdown site or browse the vignettes within your package in the RStudio IDE.
And I think it can create some excitement and engagement around your package. And then the last thing that they recommend is contributing guidelines and a code of conduct: adding those to your repository and your package as well, to let users know the best way to contribute, along with the guidelines and principles you expect contributors to follow, so that contributions happen in a friendly way, in a safe environment that works for everybody on that repository. usethis has helpers for scaffolding those files too, as shown in the sketch below. So some excellent spring cleaning, if you will, ideas and examples from Jumping Rivers. Spring is here, and I think it's time for all of us to start diving into these.
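A minimal sketch of scaffolding both files with usethis (the contact address is a placeholder):

```r
# Add a CONTRIBUTING.md based on the tidyverse template
usethis::use_tidy_contributing()

# Add a Contributor Covenant code of conduct;
# "[email protected]" is a placeholder contact
usethis::use_code_of_conduct(contact = "[email protected]")
```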
[00:24:27] Eric Nantz:
Yeah. I've literally been living this life for, like, 3 weeks now with this legacy package, and I've seen so many areas that need a little attention, a little cleanup here and there. So on all these principles, I either have or will take action quite a bit. And back to hex stickers: yes, yours truly did revise a hex sticker for this legacy package. I'm gonna give a quick plug. In the blog post here, they're referencing, I believe, the hexSticker package, easy for me to say, as an R way of doing it.
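If you'd rather script a sticker than point and click, hexSticker can do it in one call. A minimal sketch, where every value (the plot, package name, sizes, and output path) is illustrative:

```r
library(hexSticker)

# Build a hex logo from a base-R plot expression and write it to
# the conventional pkgdown logo location
sticker(
  ~plot(cars, pch = 16, cex = 0.6, axes = FALSE, ann = FALSE),
  package = "mypkg",            # placeholder package name
  p_size = 20,                  # package name font size
  s_x = 1, s_y = 0.8,           # subplot position on the hex
  s_width = 1.2, s_height = 1,  # subplot size
  filename = "man/figures/logo.png"
)
```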
And then, also, if you wanna bring Shiny into it: many years ago, Colin Fay, as part of a Shiny contest submission, released the hexmake Shiny app, where you can literally create a hex sticker, superimpose an image on top, all within a Shiny app, and download it. So I actually used that to make a new hex sticker for my internal package. That was a lot of fun, and I'm never shy to plug that fun Shiny app from my bookmarks. And, honestly, back to the contributing guidelines: when I would build these legacy packages back then, maybe I was naive; it always seemed like it would be just me, so I didn't put a lot of thought into it. But you know what? These packages can really thrive when you have somebody who at least wants to be active with you, if not maybe developing day to day, then at least, you know, helping you test things out. Maybe they're a liaison to other users who have feedback, and then they can find the best way to help you act on that feedback. But you wanna give them the easiest way to get started. So these contributing guides, whether your package is open source or within the confines of your industry firewall, I think are critically important, so that the folks who may be willing to step in know where to start. Things like a contributing guide, and also making good use of issue labels in whatever your system is for issue ticketing, things like "good first issue" or "help wanted". It's project specific, of course, but make it as easy as possible for people to drill down to the areas where they can contribute the most. So, again, great things to think about as you're already in the midst of making your packages a little more tidy along the way. Yeah. Really good advice here.
Well, I teased this earlier, Mike, but our curator here has been hard at work not just curating this issue; Colin has been knee deep in this learning journey, this saga of supercharging his workflows with WebAssembly. In fact, what we're gonna be talking about in the last segment here is, I believe, the sixth post in his series exploring WebR and WebAssembly with respect to interactive web applications. And what we're going to talk about here, which seems to be kind of a culmination of everything he's been learning, is the idea of new tools that, within the native JavaScript world, bring WebR, WebAssembly powered by R, into these applications via two new utilities that work in tandem. So let's dive right in. Earlier in his explorations, he'd been prototyping some interesting use cases: converting an existing Shiny app to WebR, preloading packages in an Express.js API, bringing your own functions into WebR and then building them into a Node.js app, and whatnot.
Well, he realized that a lot of that was, you know, kind of piecemeal learning, a bit here and there, ad hoc. What if we wanna take those best practices that he's outlined and make them easy to apply, with some parallels to what we get in the R community when we build packages with, say, devtools and usethis, and, in the Shiny situation, of course, what Colin's authored with the golem package? What's a way to bring that all together in this native WebAssembly and JavaScript world? So in this blog post, he announces two new utilities to make this happen. One of them is called webrcli, which, again, is going to feel very similar to the devtools/usethis paradigm, along with other functionality we'll get to later. It helps you create a Node.js JavaScript project, but with the bindings to WebR already baked inside, things that he was building manually in the earlier stages of this journey.
This utility is going to bootstrap that for you, not too unlike what you would do with usethis and create_package(), I forget the exact name of it, but it's the one that gives you the scaffolding of an R package right away, and then it's up to you to fill in the blanks, if you will. This is a very similar framework, but for the WebAssembly piece of all this. And that works in tandem with the other utility, which is called spidyr, which looks like a way to build additional functionality on top of WebR itself, such as what we get in typical R installations when we want a package from CRAN, or maybe even from GitHub with the remotes package.
We have functions to literally install those packages, right, install.packages() or remotes::install_github() or whatnot. spidyr is giving you a native JavaScript function that will look very similar to those installation commands, built on top of WebR, to bring those packages down to your local project. This is kind of amazing to me: it's not just doing the installation, it is putting the packages in a project-specific directory that, if you're familiar with renv, will look very similar. I went through the GitHub example that we'll have linked in the show notes.
He ignores this directory in his .gitignore, but what I did is I cloned the project locally to give it a try. And sure enough, there is a directory called webrpackages, and when you go inside it, it looks very similar to your renv library where packages get downloaded. This is fascinating to me. Colin has figured out how to load these packages from a file store into these WebAssembly-powered applications. This is massive to me, because where I'm going with this ongoing pilot submission with WebAssembly, we want to explore ways of not just grabbing packages from the WebR binary repository on the fly, so to speak.
But should we want to distribute packages as part of a bundle, how do we bring those into the application locally? So I will be looking into this quite closely to see if I can take some nuggets from it, whether for this particular pilot or for future explorations, to see if I can mirror this with things like Shinylive, which we're using right now in our pilot submission. So the wheels are turning after reading Colin's post here. This is all fascinating to me. And one thing you'll notice we didn't mention here is Shiny. Right? He is speaking on behalf of those who maybe are familiar with JavaScript-native ways of building a web application: if you have a function in R, or a package in R, that you wanna leverage as the back end to that Node.js or other JavaScript-like app, this set of utilities is your way to make that happen.
And I definitely invite you, if you do have Node.js and npm installed on your machine, to give this a shot. I literally ran through the blog post this morning, and everything worked to a tee; everything worked as advertised. So this, I think, is opening a lot of possibilities. But as we often say in these explorations, it's early days. He has not tested this on more than a few examples, and he is very eager to get community feedback on how it goes for those also willing to explore this kind of blazing trail, if you will, on this new journey.
There are notes at the end about how he kinda pulled this off from a back-end perspective. But, again, he's looking for feedback on this, and I am definitely intrigued by what I'm seeing here, and I can't wait to learn more about how it works under the hood. And this is a great time to talk about WebAssembly, because I'm thrilled to say that, as of yesterday when we recorded this episode, there is a fascinating new article published by Nature, authored by Jeffrey Perkel, entitled "No installation required: how WebAssembly is changing scientific computing". And I'm humbled to say that yours truly has a small little quote in there, based on our current explorations. But I will say, this is the kind of stuff I am super excited about. We are trying to push the envelope here.
We think there is a massive potential in many industries for this. Of course, I'm coming from life sciences, but there are many, many others that I think can make heavy use of WebAssembly. This article has terrific narrative around kind of the genesis of this from George Stagg himself as he started prototyping WebR along with other members of the scientific community and how they're showcasing the use of this technology. So great time for me to see this. And, again, super excited to dive into what Colin's exploring here and see how we can supercharge this in the future.
[00:34:47] Mike Thomas:
Yes. Me as well. And, Eric, I'm glad that you shouted out that article, because if you didn't, I was going to. You know, when I first started doing this podcast, I was starstruck that I got to record with the host of the R-Podcast and the Shiny Developer Series. Then I think I'd gotten a little comfortable with you, but now you are featured in Nature, and I'm right back where I started. So hats off to you; that is an awesome accolade. And it's super exciting as well to see that the scientific community is talking about this stuff, and that it's not just us software nerds who care about it; it's really something that other folks seem to be seeing as a pretty revolutionary thing coming into the ecosystem. And, fortunately, we do have folks like Colin who are at the cutting edge of this WebAssembly stuff. You know, Colin curated this week, and when I saw the blog post, I thought this
[00:35:55] Eric Nantz:
was a little bit of insider trading, but I'm very glad that this one made the highlights. It's a great example.
[00:35:57] Mike Thomas:
You know, one of the toy examples here is called this WebR SpongeBob example, which I think just sort of allows you to, you know, essentially change some text, a string that you write, to what's called sponge case. A quick story: during a particularly slow period a couple years ago for me, I highly considered creating this exact R package. I didn't end up doing it, and I'm glad I didn't, because it looks like maybe Colin was the one who created the spongebob package. I'm not sure if it was him or somebody else, but somebody took care of it for all of us. Obviously, that's a very important package in the R community, so it's nice to have that one out there. But this is a phenomenal guy. That was Jeroen, right? That was Jeroen. Yeah. And like you said, you know, it's incredible. I haven't actually tried it myself, but it sounds like you maybe forked the repository as well and ran through this and found that there were no issues. Obviously, I think Colin, in both the READMEs in these repositories and in this blog post as well, makes a lot of disclaimers that this is very early on, very experimental.
You know, expect a lot of bugs. He, I think, may already be seeing some bugs and edge cases that he's hoping to solve. But regardless, I think the fundamental concepts of what's being done here are really driving this idea, in this space, forward: this WebAssembly topic, and not needing to manage dependencies in ways that traditionally were a little difficult, making that much easier and much more accessible to a wider variety of people, which is incredible. So I'm excited to see how this continues, and I'm patiently waiting on blog post 7.
[00:37:45] Eric Nantz:
Yeah. Me as well. And Colin's in this realm of, I'm gonna say, key, amazing thought leaders in this space who are being very adventurous with what's happening here, in the same category as I would consider Bob Rudis and his explorations with WebR tying into things like Observable Framework and whatnot. WebR is this engine that is powering so many things. Yes, I've been coming at it mostly from the Shiny perspective with Shinylive, but it is so much more than just that. We even featured, what was it, 2 or 3 weeks ago on this very podcast, a blog post that had an R console basically embedded into the post itself, to try out the code that was being showcased there. Right? The education side, the web application side. And now, as that Nature article is showing, even high-throughput, HPC-like computation in the browser. It's all coming together. It is. I mean, I don't know, I haven't been this geeked out in years about ways we can tie our entire data science workflow together with a novel technology. And I know I can't stop talking about it, but at the same time, this is the start of something. I still remember sitting at the posit::conf presentation by Joe Cheng at the end there, and all of us looking at each other across the room like: yep, we're going with this. We're going to try stuff out and see what happens. Challenge accepted, Joe, if you're listening to this. So, yeah, really cool stuff to see what Colin's exploring here. And it does show that I still have a lot to learn, but at the same time, I'm gonna enjoy learning about this.
[00:39:28] Mike Thomas:
Likewise. Likewise. And, you know, I take it from a Shiny perspective as well. And then, with the DuckDB piece, you have to think about the data side of it, right? Maybe you have an external connection to a database, which makes things easier, maybe not. And, you know, the fact that there are now these integrations between DuckDB and WebAssembly, which I think are going to solve that final piece for us in a lot of ways, in terms of connecting the data to the application, or whatever you're showing on screen, in an easy way, it's incredible.
[00:40:00] Eric Nantz:
Yeah. It absolutely is. And we're gonna be hearing a lot more about this throughout the year. I'll also give a plug once again that we're thrilled to have George Stagg give a keynote at the upcoming Shiny conference coming up in April. So if you're not registered for that, I highly recommend coming to that event as well. And, yeah, my cohost here is gonna have a Shiny app on there as well, so we're really excited for that. Yes. That's exciting. Coming up quick. It sure is. It sure is. But, you know, speaking of quick: it's always a quick yet very educational read whenever you see R Weekly every single week. We don't try to bog you down too much; we give you the awesome resources, blog posts, tutorials, and, as we mentioned at the top, new packages hitting the ecosystem, updated packages, and much, much more. So we're gonna take a couple of minutes to share some additional highlights here. And going back to the Shiny train for a little bit, I had a thought-provoking insight led by this blog post from Jakub Sobolewski over at Appsilon, entitled "Using Tests to Develop Shiny Modules".
Now, this is something that usually you don't really think about until you get to the stage where your module is almost done and you're thinking, okay, how do I make sure that it's robust enough? But Jakub does a great outline here of how the concepts of test-driven development really come into play. If you're iterating on a specific module, there are ways to test it efficiently without having to run the entire app every single time, making clever use of testthat functionality and custom functions and whatnot. This looks like something I'm gonna start using as I revamp some of my major Shiny apps or build new ones, in terms of making that development cycle of developing modules just a wee bit faster, to get things done quicker, as they say. So, yeah, a really thought-provoking post from Jakub here.
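To make that concrete, here's a minimal sketch of the idea using shiny::testServer(), which drives a module server without launching the app. The doubling module here is a made-up example, not one from Jakub's post:

```r
library(shiny)
library(testthat)

# A toy module server for illustration
doubler_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    doubled <- reactive(input$n * 2)
    output$result <- renderText(doubled())
  })
}

test_that("doubler module doubles its input", {
  testServer(doubler_server, {
    # Simulate the user setting the input, no browser required
    session$setInputs(n = 21)
    expect_equal(doubled(), 42)        # the reactive value
    expect_equal(output$result, "42")  # the rendered output
  })
})
```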
[00:42:01] Mike Thomas:
Yes, I found a blog post, an additional highlight here, from Alexandros Kouretsis from Appsilon, entitled "R Plumber: How to Craft Error Responses That Speak Fluent HTTP". You know, we talk about Shiny a lot, Eric; I've said that a few times today. And one thing we talk about a lot is creating the best user experience possible around our Shiny apps. A lot of times that includes handling errors in a graceful way, so the user understands what went wrong instead of just getting disconnected from the server. Right? That's what we try to avoid.
And the same principles, I believe, apply to APIs, like Plumber APIs. Right? If something does go wrong, there's going to be an error code sent back to the application making the request, and typically that error code is going to be either a 400-type error or a 500-type error. 500-type errors typically mean that something went wrong on the server side, whereas 400-type errors typically mean something bad happened with the inputs that went into the request: they didn't satisfy what the API was expecting, as opposed to, you know, the server being down or something like that. So understanding the difference between those, and being able to return something more informative back to the calling application, means it can build some sort of UX based upon what type of error code comes back, so the user can understand exactly what went wrong. Should they fix a particular field that they filled out incorrectly before clicking the button that sent that API request? Can you give them some information about how to rectify the problem? Or do they need to contact IT because the server itself is down? Right? Understanding that difference is really important. So this is a great blog post that walks through a discussion around that and how to make things safer there. And I will also shout out a project by Peter Solymos and his team at Analythium: the package is called tryr, t r y r, which does the exact same kind of thing. I think it's client/server error handling for HTTP APIs, and he has a lot of examples with Plumber there, with the same exact idea: you're trying to provide a more informative error code response back to the application that sent the request initially. So some great resources here to shout out. And I have been knee deep in Plumber lately and really enjoying it, so this is very timely for me.
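As a minimal sketch of that idea in a Plumber endpoint (the route, status choices, and messages are invented for illustration, not taken from the post or from tryr):

```r
library(plumber)

#* Divide two numbers, speaking fluent HTTP when the inputs are bad
#* @param x
#* @param y
#* @get /divide
function(x, y, res) {
  x <- suppressWarnings(as.numeric(x))
  y <- suppressWarnings(as.numeric(y))
  if (is.na(x) || is.na(y)) {
    res$status <- 400L  # client error: malformed input
    return(list(error = "x and y must be numeric"))
  }
  if (y == 0) {
    res$status <- 422L  # client error: valid types, unprocessable value
    return(list(error = "y must be non-zero"))
  }
  # Anything unhandled would still surface as a 500, the server's fault
  list(result = x / y)
}
```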
[00:44:43] Eric Nantz:
Yeah. Really, really awesome insights there. And I'm also on the train of either helping build new APIs or consuming existing APIs, with R layers on top of that. So, yeah, having any way to give that UX a much more pleasant experience, not only for me as a developer, but for my end user, who is not gonna give two wits about what's actually behind the scenes. They just wanna know what happened and how to fix it. So anything like this that translates the cryptic 403s or 502s or whatever you wanna call them, because they're all cryptic in the end to most statisticians and data scientists. Being able to translate that, and having a robust paradigm for error handling, is very welcome in this space. And I think it speaks to a new trend we're seeing: we're interfacing with other systems of some sort. For me, it's traditionally been HPC systems, and now I'm really augmenting that with these web services, which may or may not be high performing, but they're doing one thing, they're doing it well, and they want to be agnostic to whatever front end we have on them. Of course, I'm biased toward R, why wouldn't I be? So having a package that interfaces with that and makes the UX seamless, that's a win for me.
You know what else helps you win? Unlike what's happening to my poor Red Wings, reading R Weekly every single week will help you win the game of leveling up your data science knowledge. I tried. I tried. I'm trying to give them good luck for tonight. But, anyway, every single week we have a new issue online, released basically every Monday morning. And then the train keeps going, and we are powered by the community. As I mentioned earlier, every single week we look at your awesome pull requests. And you may wonder, how do I get something on there? It's all linked at rweekly.org; we have a link to the upcoming issue draft right at the top right corner.
We're just a pull request away from that new blog post, or maybe that new package you just created following the advice we mentioned in the first highlight. R Weekly is a great way to showcase that. It's all markdown, all the time. You know, I've lived the markdown lifestyle with my package documentation, my internal blog posts, and some of this external stuff I'm doing. Without markdown, if I had to do, like, LaTeX for all this, I would cry. I would just cry, Mike. Thank goodness for markdown. Yes. You are exactly right. Thank goodness for the Shiny includeMarkdown() function as well. Shout out. Very, very nice. Yes, I've used that heavily and with no regrets at all. So, yep, all markdown, all the time, at R Weekly. And also, we'd love to hear from you directly as well. There are many ways to do that. We have a contact page linked right at the bottom of the episode show notes that you can click to.
We also, if you're listening in a modern podcast app, like Podverse, Fountain, Castamatic, or CurioCaster, there's a whole boatload out there, you can send us a fun little boost along the way right in your podcast app itself. All the details are linked in the show notes as well. And lastly, we are sporadically on various social media outlets. I'm mostly on Mastodon with @[email protected]. I'm also on the Weapon X thing from time to time with @theRcast, as well as LinkedIn, where you can find me with show announcements, blog posts, and the like. And, Mike, where can the listeners find you? Sure. LinkedIn is probably the best place to see what I'm up to. You can just search Ketchbrook Analytics,
[00:48:09] Mike Thomas:
k e t c h b r o o k. And if you wanna find me on Mastodon, you can find me at [email protected].
[00:48:20] Eric Nantz:
Yep. I think we've put a nice little bow on this episode. But, again, it's been a great recording session once again, Mike, and we hope to see you all for the next edition of R Weekly Highlights next week.