We kick off 2024 with a jam-packed episode! Learn four ways to streamline your R workflows, a proposal for a new pipe assignment operator in base R, and our raw responses to a surprising turn of events affecting one of the most influential members of the R community.
Episode Links
- This week's curator: Eric Nantz - @theRcast (Twitter) & @[email protected] (Mastodon)
- Four ways to streamline your R workflows
- The case for a pipe assignment operator in R
- Bye, RStudio/Posit! - After writing all these "*down" packages for these years, here I am to announce "Yihui-down"
- Entire issue available at rweekly.org/2024-W02
Supplement Resources
- R-Weekly Curation Calendar Dashboard https://rweekly.github.io/rweekly-calendar/
- Quarto All the Things workshop from R/Pharma 2023 https://www.youtube.com/watch?v=k-dQ36sx4Rk
- Rami Krispin's VS-Code R container template repository https://github.com/RamiKrispin/vscode-r-template
- Assignment pipe operator discussion on Mastodon https://mastodon.social/@eliocamp/111664623134443564
Supporting the show
- Use the contact page at https://rweekly.fireside.fm/contact to send us your feedback
- R-Weekly Highlights on Podcastindex.org - You can send a boost into the show directly in the Podcast Index. First, top-up with Alby, and then head over to the R-Weekly Highlights podcast entry on the index.
- A new way to think about value: https://value4value.info
- Get in touch with us on social media
- Eric Nantz: @theRcast (Twitter) and @[email protected] (Mastodon)
- Mike Thomas: @mike_ketchbrook (Twitter) and @[email protected] (Mastodon)
[00:00:03]
Eric Nantz:
Hello, friends. Did you miss us? Yes. The R Weekly Highlights podcast is back and it is 2024. We are kicking off the new year in style. We have a supersized issue to talk about. But if you're new to the show, this is the show where we talk about the latest R Weekly issue, curating the latest and greatest in R content, whether it's adventures in data science, new packages, tutorials, and much more. My name is Eric Nantz, and I'm always delighted that you joined us from wherever you are around the world. And I certainly hope you and your loved ones, family, friends, all had a safe and relaxing holiday season, whichever ones you celebrate.
Yeah. It's hard to believe it's 2024 already. But just like last year, I don't do this alone. I got my awesome co-host, Mike Thomas, joining me once again for another year of R Weekly Highlights. Mike, how are you doing today?
[00:00:45] Mike Thomas:
I'm doing well, Eric. I'll be honest with you. I missed this. I hope the audience missed it as well. I missed it just as much as you folks. So I am so, so excited that R Weekly is back, that we had a great curator this week, and 2024 is hopefully now off and running for us here on R Weekly.
[00:01:16] Eric Nantz:
Yes. It is amazing how this makes me feel a little more normal again going back on the mic with you, so to speak. It was definitely a busy break for me, but, yeah, it's great to get things back in motion on the R side of it. And, yeah, that curator, that wacky individual, yeah, that was actually yours truly this time. And I really went above and beyond in a different sense, and this is totally unrelated to the issue itself. But I figured I've seen some cool stuff built with Quarto lately. I saw the new dashboards we've actually talked about on this very show recently. I thought, you know what?
For our curator team, we often have to share this rather cryptically formed URL of our public curation calendar. I hosted it on Nextcloud, which is great, by the way. Nothing against Nextcloud. But the URL they give me for sharing, yeah, that's just not gonna be easy to remember. So I thought, well, wait a minute here. Guess what? Because of my streaming adventures years ago in making what I call the streamers calendar, I know there are some HTML widgets out there that could play nicely with Quarto to render this same calendar, but, hopefully, as a more friendly dashboard that gives us both a calendar view and a more tabular view of who's on which week for the curation.
So, yeah, I took a few days. I now have a Quarto dashboard of our public curation calendar, which I'll link to in the show notes. It's hosted on GitHub Pages, but it takes heavy inspiration from Garrick Aden-Buie's Norfolk data Quarto dashboard. And I figured, you know what? If he can do it, so can I. So that is now publicly available. I have a link to it in the supplements of the show notes here in case you wanna have a look, but it was my first adventure with a Quarto dashboard. So after I got that squared away, yeah, I got down to business and curated this week's issue.
And also, this is, like I mentioned, a supersized issue because we were off for a few weeks as a whole team. So there are a collection of stories here from previous weeks that would have been released if we weren't on break. So either way, buckle up. We got a lot to talk about, and there's a lot to read after the show too. But I can never do this alone. I have tremendous help from our fellow R Weekly team members and contributors like you around the world with your awesome pull requests to make all this happen. And we lead off with a nice retrospective-like post of areas that we've talked about last year, especially with respect to how you can streamline your R workflows.
And our author this week is the esteemed Nicola Rennie, who has been a frequent contributor to R Weekly Highlights in the past. And she ends up giving us this nice end-of-year blog post talking about four ways that she's learned to streamline her R workflows. And each of these hit home with me, and I'll be curious, Mike, how much you've been using these in practice as well. The first area that she talks about may sound basic, but, boy, is it effective: using template files. Because, you know, everyone does this. We have that set of scripts that we created. And then there just might be a new data type, but it's mostly the same. Maybe just a couple changes in variables, or maybe a similar service you're pulling from. And, yeah, you do the copypasta, as they say.
Well, why not make a more templated structure like Nicola did with her TidyTuesday analysis script? So now she has a set of scripts that she can create dynamically based on templates for that current week of TidyTuesday, having all of her reusable functions for visualization and other aesthetics and data processing all set to go. And, yes, this blog post has links to the individual blog posts where Nicola talks about this in more detail, that you can check out. So that is a great, great way to do it. Low-hanging fruit, as they say, to get you some much needed savings in time.
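To make the idea concrete, here is a rough sketch in R of generating a weekly analysis script from a template. This is not Nicola's actual code; the function name, template text, and file-naming scheme are all invented for illustration.

```r
# Hypothetical helper that stamps out a new TidyTuesday script from a
# template. All names and paths here are illustrative, not from the post.
library(glue)

new_tidytuesday_script <- function(year, week, dir = ".") {
  template <- paste(
    "# TidyTuesday {year}, week {week}",
    "library(tidyverse)",
    "",
    "tuesdata <- tidytuesdayR::tt_load({year}, week = {week})",
    sep = "\n"
  )
  path <- file.path(dir, sprintf("%d-week%02d.R", year, week))
  writeLines(glue(template), path)  # fill in the {placeholders} and save
  invisible(path)
}

new_tidytuesday_script(2024, 2)  # creates ./2024-week02.R
```

The same pattern extends to any project: keep one template per script type, interpolate the bits that change each week, and never copy-paste again.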
And speaking of templates, if you find yourself making multiple GitHub repos that look really similar in structure, especially when beginning for the first time, yes, I feel seen. I do this almost every day at the day job. You can leverage GitHub repository templates. This is huge, because instead of, like, doing the blank repo, maybe getting your renv library in there, getting your helper scripts for that data processing or whatnot, why not start yourself on the right foot? Have a template structure in place. Nicola has been making great use of this in her workshop materials, because she was quite busy in 2023 teaching some workshops.
I'm still very privileged that we had her at the R/Pharma series of workshops talking about machine learning, and she did a tremendous job with the material on the GitHub repository that she created. But, yeah, that now starts in her workflow from a GitHub repository template. I have also made use of this with my fancy schmancy Docker container setup. So now, every time I know I'm going to do an R-related project, I want to leverage VS Code with the Docker container and the custom RStudio IDE server edition in the Docker container. Instead of, like, repopulating all those Docker files one by one, I just have a repository template that I use for my new repo. That makes for a heck of a lot of time savings. I only have to change a couple environment variables, and then, literally, I can just bootstrap that thing in less than five minutes.
It is, again, a huge time saver for me as well. Now, third up in this list is one I need to be much better at. So it's confession time with Eric here. I do not do a great job of linting and styling my code. Well, guess what? There are R packages that help you with this, in particular, the lintr package and the styler package. These can take away all those spacing issues and syntax errors that you might not detect until you actually run the code. If you author Shiny apps, you know what I'm talking about. You're missing that bracket, you're missing that variable for that reactive? You're in trouble. Don't get me started on that. But, yeah, it's not just finding those errors. It's also making sure that your code has a consistent style. And styler can be customized to meet your needs, but it's gonna get you off on the right foot very quickly.
And boy, is that helpful for multi-team, multi-person projects, where you wanna make sure your whole team has a unified approach on how you style the code. So I would definitely make use of styler, and also lintr to find all those nagging issues that you would manually have to detect yourself in the old days. And lastly, yeah, at the top of the show, I talked about Quarto. Right? Well, Quarto comes with a lot out of the box, and I do mean a lot, but there may be things that you wanna do to build on top of it. And so Nicola had an adventure in the early part of 2023 building a Quarto extension to help extend the styling that she did for PDF reports.
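As a quick sketch of those two packages in action (these are real lintr and styler functions; the paths are placeholders):

```r
# Report style and syntax issues across a project's R/ directory:
# long lines, inconsistent spacing, undefined variables, and more.
lintr::lint_dir("R")

# Auto-format the same files in place to a consistent style.
styler::style_dir("R")

# styler is customizable; e.g., keep the tidyverse style but indent by 4:
styler::style_file("R/app.R",
                   transformers = styler::tidyverse_style(indent_by = 4))
```

Running the linter first and the styler second is a handy habit: the linter flags real problems, while the styler silently fixes the cosmetic ones.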
This is really useful, especially if you wanna get that similar flavor as you might have had in the past with some of these custom R Markdown templates, but wanna have a similar thing with Quarto. These ways that you can build extensions might be a path forward. It definitely does take a little getting used to. I mean, certainly, if you're comfortable with LaTeX, you're gonna be right at home with some of the styling for PDF output. But in general, I believe it's using Lua scripting, so you might have to do a little level-up on that as you go along. But the linked resources on her more detailed blog posts can get you up and running quickly.
And also, for a plug, I mentioned the R/Pharma workshops earlier. We had an R/Pharma workshop from Devin Pastoor about building Quarto extensions. So I have a link to that in the supplements of the show notes as well if you want to really get into the weeds of building a complex extension. It's definitely something you might have to get used to. But again, if Quarto is not doing everything you want out of the box, you can definitely customize it. And I am a very big consumer of extensions, especially for the web-based presentations of Reveal.js. Big shout-out to Emil Hvitfeldt, who has built immense extensions such as, like, the code window, even that fun little confetti extension. I'm not sure if he created that or if that was someone else. But I make use of quite a few in the Reveal.js space.
So, again, these are all great things that are attainable in your R journeys, making things a little more automated, a little more repeatable. And, yeah, Nicola concludes the blog post with a few intriguing notes on what she's interested in. And, boy, she's also gonna be pursuing the Nix package train as well. So Bruno is not the only one. I'm also trying to pursue this as well. So we might be seeing some blog posts from her about that. And, also, it looks like she's looking at Rust as well, another language that's been getting a lot of attention in the R community these days. So it's a fantastic way to inspire you at the start of this new year to maybe supercharge your workflows a little bit.
So, Mike, what did you think of Nicola's blog post here?
[00:10:45] Mike Thomas:
I couldn't agree more. I think there's a lot of tips in Nicola's blog post that would help us get off to a fresh start in the new year and start employing some of these best practices for ensuring that your workflows are consistent across projects and make you as efficient of a data scientist as you can possibly be. I think using template files is a great use case. I am so guilty of sometimes going from project to project and copying the majority of a README file, for example, you know, especially where we talk about handing our projects off to clients and how they can use the renv package that we have in our deliverable to essentially reproduce the environment that we created it with.
And, really, I think READMEs for us could be a template file and a great use case that a simple script like the one that Nicola actually has screenshotted here could help with. You know, I know that there are packages like usethis and devtools and golem and things like that that will actually create scripts for you. And the code is pretty lightweight, so don't be afraid to do it yourself. I think it's a really awesome trick that can be really underrated as well. Using GitHub repository templates, I will say, is one that I actually do and that we actually do internally. And that is hugely helpful. And I know, Eric, you do this quite heavily, at least within your own personal development work, because you have GitHub repository templates that spin up your whole environment that make use of dev containers.
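A minimal, hypothetical version of that kind of lightweight script might look like the following; the template path and placeholder fields are invented, not taken from Nicola's screenshot.

```r
# Fill a README template containing {placeholders} with per-project values.
library(glue)

render_readme <- function(template_path, values, out = "README.md") {
  template <- paste(readLines(template_path), collapse = "\n")
  # glue_data() looks up each {placeholder} in the supplied list.
  writeLines(glue_data(values, template), out)
}

# Hypothetical usage:
render_readme(
  "templates/README-template.md",
  values = list(project = "client-analysis", r_version = "4.3.2")
)
```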
Right? So you can go essentially from project to project and have your isolated environment and your development environment sort of consistent from project to project. And that's really helpful as you go from project to project to ensure that all of the dependencies that you need to have within your projects, the ones you're sort of consistently applying the same workflow to, can get installed quickly and consistently. I've seen a lot of articles lately as well from Rami Krispin, who is trying to, I think, accomplish a lot of the same things that you've talked about as well, Eric. And he has some great content online about how to set up your own GitHub repository templates, I believe, for both R development and Python development using dev containers that have those devcontainer.json files to sort of specify how you want your VS Code environment to look, and then a Dockerfile to manage all of those dependencies.
And one of the cool things about that as well is, if you're interested in leveraging GitHub Codespaces, you can essentially apply that immediately just by clicking a button within GitHub, and it will leverage those dev container assets to spin up that environment in VS Code in the cloud for you. You don't even need to have VS Code installed on your own laptop. I know there's a cost associated with that, but it's pretty cool. And when we think about collaborating across teams, you know, these things become really important. And for us, that's not only collaborating across teams, but it's also handing off our deliverable to the client at the end of the day and ensuring that they can reproduce our work, and whatever we've created for them in terms of that deliverable, exactly in the same way that we intended it to be and the way that we created it. So that's really helpful.
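For a flavor of what Mike describes, a minimal `.devcontainer/devcontainer.json` might look something like this. The extension IDs and fields follow the Dev Container spec, but treat the values as an illustrative starting point rather than Rami's actual template:

```json
{
  "name": "r-project",
  "build": { "dockerfile": "Dockerfile" },
  "customizations": {
    "vscode": {
      "extensions": ["REditorSupport.r", "quarto.quarto"]
    }
  },
  "postCreateCommand": "R -q -e 'renv::restore()'"
}
```

The `build.dockerfile` entry points at the Dockerfile that pins your system dependencies, and `postCreateCommand` restores the renv library so the environment is ready the moment the container (or Codespace) starts.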
And then the section on linting and styling code, you know, it must be something in the water on this podcast, Eric, because I am also guilty of not utilizing the lintr or the styler R packages. I am very strict. We have an internal handbook about how to style our R code and our Python code, and we try to adhere to that pretty well. And obviously, within our code review process, that's a component of it, to make sure that the code is styled in line with sort of our expectations internally. But I was talking to a client recently who we're trying to help set up sort of a good team data science workflow and implement some best practices within their organization, and really trying to convince them that code styling is important for collaboration and streamlining code review, and making all of those processes more efficient.
And one thing that I think I failed to mention to them, that I need to mention to them, is how the lintr and the styler packages can expedite that process and maybe also provide safeguards to ensure that your code is styled in line with the guidelines that you've set up for your organization. And maybe that's not necessarily the default styling that these packages have, but I know at least within one or both of these packages, you can actually apply some settings to how you want that code to be styled. And you can make some changes to how you want code to be styled internally at your organization, or maybe just across your personal development projects. And I think that that is incredibly powerful and incredibly underrated, which I think is a theme of this blog post as well.
I love the section on building Quarto extensions. You know, Quarto is super hot in the streets right now. We are moving everything to Quarto. All new projects are starting with Quarto. One thing that I love as well, that I think aligns very well with Nicola's section here on building Quarto extensions, is we unfortunately deliver a lot of PDF reports. And that means that we use LaTeX quite a bit. In the Quarto YAML file, you can actually set up your LaTeX assets, your .tex files, such that you can have parameters within these LaTeX assets.
For example, like the title of your report, a background image in your report, subtitle, things like that, that may change on a project-to-project basis. And you can specify those parameter values within your Quarto YAML file, and those will get passed to those LaTeX files. We do that on every single project, you know, because the project title is gonna change. Maybe the image that we'll have on the cover page is going to change. Things like that. But it's amazing how well those things play nicely with each other. So if you're someone out there who finds yourself delivering a lot of PDF reports through Quarto, I would recommend checking out some of the links that Nicola has in this blog post. If you really wanna spruce up the look and feel of that PDF report, then I think learning a little bit of Lua can go a long way as well. But there are ample links here, as well as on the Quarto website, that I think can help you really make that Quarto PDF LaTeX report look nicely branded to your organization, or look and feel however you want, and customize it nicely. And Nicola gives us a little bit of a preview into some of the work that she's planned for herself in 2024, and I can't wait to see what comes next from Nicola, because it's always super relevant, always super helpful, and I learn a lot.
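As an illustrative sketch of that pattern (the option names under `format: pdf` are standard Quarto PDF options, but the file names and values here are made up):

```yaml
title: "Client Project Report"
subtitle: "Q1 2024 Deliverable"
format:
  pdf:
    documentclass: scrreprt
    include-in-header: tex/preamble.tex
    template-partials:
      - tex/title.tex
```

Quarto passes the document metadata (title, subtitle, and so on) through to the LaTeX template, so a custom `title.tex` partial can reference those values and restyle the cover page per project without touching the LaTeX itself.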
[00:18:03] Eric Nantz:
Me as well. And there are just so many threads here that I wanna implement, both in my open source work and in my day job work. In fact, we're even thinking, for the internal user group I maintain, of having some kind of maybe quarterly or bimonthly newsletter that can go out highlighting the current events and the company initiatives. Maybe I could build that with Quarto, with a PDF setup like she's done in her earlier efforts. And, yeah, the linting thing, I really gotta get better with it. What's interesting is that in my dev container setup, I do have a hook to do linting automatically in VS Code with a .lintr config file. So I'm getting there on the right track. I just have to actually listen to the advice. But I will say VS Code is very obvious when it thinks you did a line longer than 80 characters. It will have a big old squiggly next to it, and it will annoy you enough that you probably will wanna change it. But certainly, I wanna make that more automated too and just get better habits. But, yeah, those two areas, and, yeah, beefing up my templates in general.
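For reference, that config is just a plain-text `.lintr` file in the project root, in DCF format; a small illustrative example (the 100-character limit is my own choice, not a recommendation from the show) might be:

```
linters: linters_with_defaults(
    line_length_linter(100L)
  )
```

lintr picks this file up automatically, so both the VS Code integration and any `lintr::lint_dir()` call in CI will enforce the same rules.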
Even just today, I have a project where I want to enhance my Shiny bookmarking state hack that I've done at the day job. Instead of having to build that up from scratch in each app one by one, I want to do a template structure I can just fold in, because I can't quite make it a package yet, but I just want those functions there every time. So I don't have to manually copy-paste them from that existing repo into the new one, change the path, change whatever. No. Ain't nobody got time for that in 2024. So lots of things here I wanna make use of, and I really appreciate Nicola's efforts on this. Couldn't agree more.
And we're gonna shift gears a little bit to a very interesting proposal for maybe a future enhancement to the R language itself, in the spirit of a recent enhancement that came a couple years ago. And that is making a case for a new pipe assignment operator in R itself. This blog post comes from David Hugh-Jones. He's been a very active blogger in the R community, and he makes this case by talking about the status quo of how R does assignments. You can either have, you know, passing by the value itself, you know, very much saying object, whatever, assign it to the result of another function, or give it a constant, or what have you.
Sometimes it gets really unwieldy when you have a variable that, of course, is being built upon a chain of functions. Right? This is one of the motivations for the pipe operator itself, which, of course, was first brought to light by the magrittr package. And now, since R 4.1, R has the base pipe operator built in, which functions largely the same way. So you can construct your, say, data processing steps in a pretty nice, streamlined workflow using packages such as dplyr and many others that are pipe-friendly. Well, David wants to take this up a notch a bit more. What was in the magrittr package? It didn't get a lot of press at the time.
It had its own version of a pipe assignment operator, where it looks like you're going to do processing on a variable and then print it out kind of interactively. No. It would actually look like you're doing this interactively, but it actually changes the value of the variable you had on the left side. So David is proposing: why not do this in R proper? And he proposes this operator as, like, the less-than sign, pipe, and then greater-than sign. So it looks like it's a pipe surrounded by the two signs, and then having that translate to a bunch of functions potentially to do something with that operation.
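For context, magrittr already ships a similar operator, `%<>%`, which pipes a variable through a chain and assigns the result back to it. A small sketch, with the proposed base R spelling shown only as a comment (the `<|>` spelling is my reading of the proposal as described here; the exact syntax is not settled):

```r
library(magrittr)

x <- c(3, 1, 2)

# magrittr's assignment pipe: pipes x through sort() and overwrites x.
x %<>% sort()
x
#> [1] 1 2 3

# The blog post's proposal is roughly a base R analogue:
#   x <|> sort()
# as shorthand for:
#   x <- x |> sort()
```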
And his rationale is that it could make the code much simpler to read and more expressive, being able to be more succinct with the syntax that you're typing. Now, this is where it gets interesting. It's one thing to kind of conjecture this, but he decided to take matters into his own hands and actually look at what is being used in the general set of code that's being shared in open source, to see where this pipe assignment could be useful. So he actually has authored his own new package, called codesamples, which apparently has scraped open source code that has been shared on GitHub and Stack Overflow, as well as the R package examples themselves that come from packages.
And he's done a nice little summary here in the blog post about how many times he has detected code that is pipeable, but also pipeable and complex, having more than a few lines in the pipeline operation. And so the post says that between about 4 and 10% of operations could be simplified by an assignment pipe, and that could be a potential gain. So you might want to look at this for yourself, to see in your code whether you have examples that match some of David's examples here, such as what he has from the Stack Overflow questions. It looks like a pretty unwieldy subsetting of a w variable, and then he translates that into a more friendly, pipeable assignment operation, where it basically streamlines it down to four lines of code, technically, instead of what looks like about 16 or 18 in the previous example.
I will admit I've had a little bit of gotchas when I've tried this in the magrittr approach before, and I admittedly moved away from it. But I can see where David's coming from here. So it is an interesting proposal. Now, for those that are new to the R project itself, it does take a bit of time for things like this to land. And, of course, it has to be agreed upon by the R core team and whatnot. So there would obviously be a lot of discussion in play, and this would not happen anytime soon if it were to actually take shape.
But with that said, it is interesting to see what this could look like. So I credit David for taking the time to not only put this out into the public domain, but do some interesting kind of data scraping analysis of existing code to see where this might benefit. So definitely something to ponder. I'm kind of in between on this one. I don't really wanna say no to it, but I'm just not sure if I would be the best customer for it. So with that said, definitely food for thought, as I say.
[00:25:03] Mike Thomas:
Eric, you know, this is one of those things where, at first glance, it kinda gives me the heebie-jeebies, but then it might be, you know, this gets implemented, you know, a couple years from now, and then I finally adopt it, and then I feel maybe someday that, you know, I don't know how I lived without it in the first place. Yeah. I think that happens occasionally. You know, I will give David a lot of credit, because he absolutely makes the case for it in this blog post. He doesn't just make the case for it. He actually writes the code to implement it in the R source. So he has forked the R source repository, and then he has made a commit that, you know, is 1,200 lines of code, C code, I think, mostly, that implements it. And kudos to him. Hats off. He's putting his money where his mouth is, and I think, absolutely, if this does get adopted, if it's something that the community does eventually want, then he will be the person essentially that spearheaded this and that did a lot of the work, I think, to get it into R's base code. Yeah. I think that probably the base pipe itself, for now, is maybe still getting adopted. I would love to see some statistics on how widely used it is. We do still see a lot of code, you know, at our clients that contains the magrittr pipe, even new projects that they're working on.
I think maybe people are just slow to adopt new things, so I would imagine that it probably wouldn't be any different if this new pipe did get implemented. And I can't help thinking that it looks like a person standing there with their hands on their hips. That's the best way that I would describe it. You always have to sort of see something, you know, it's like seeing something in a potato chip, when something new comes out. But I think it's super interesting. I have a little bit of trouble with it, you know, the explicit assignment with the arrow is something that I think helps me for code review, to be able to really clearly see when an object is getting created.
I know that there are some folks out there, probably more in the Tidyverse community, that warn a little bit against the use of the assign function for that same reason, because it's not sort of explicitly showing you where new objects are getting created in your environment as you read some code. That's a function that I've tried to get away from in more recent years. So it's gonna be to each their own. I would say, if this is something that you really feel could help you, then get in touch with David and support the cause. And I guess the great thing about open source is you can use what you wanna use, and you don't have to use what you don't wanna use. Right? I'm sure the assignment arrow is not going away anytime soon. So I'm not very worried. But, you know, the one place where I think it goes back to maybe our discussion on Nicola's blog post is adopting some style guides and sticking to them. Right? So if you want to adopt this new pipe, if it ever gets merged, as sort of the way that you are going to go about assigning objects internally within your organization, then by all means, you know, merge that into your own style guides and into your own consistent practices. But I think consistency is important. I think ensuring that you are writing code that sets up the reviewers or the collaborators on that code for success, and makes their life easy and efficient, is really important. So that's really all I have on this blog post, my 2¢.
I don't feel super strongly one way or the other. I think it's really, really interesting. And again, kudos to David for not only making the case, but for going through the entire process of actually implementing it and showing what that might look like.
[00:28:57] Eric Nantz:
Yeah. Definitely credit to him for, you know, following through on what this actually could look like. And, you know, I will never turn down having choice in this space. I mean, there may be others that would get such great benefit out of a complex pipe assignment. Have at it. Right? I mean, I don't have to use everything that's in R core or these packages that I use. I just use what's best for the project and what's best for me. So I will admit it would probably make doing code review of it a little harder as of now. But, heck, maybe practice makes perfect. Who knows? But, yeah, the assignment operator, you're gonna have to pry that out of my cold dead hands. I use that every single time, and that helps me reason out where the key variables are in this pipeline.
And, also, to me, it helps make debugging a little easier at the cost of being a little more verbose. For me, debugging and reviewing are probably the main criteria I'm gonna use as I think about whether I would adopt this. But, again, I'll be very interested to see what the community has to say. So I'll be keeping an eye on, say, Mastodon and other social areas to see what kind of discussion this spurs on, but we may be hearing more about this sooner than we think. Who knows?
[00:30:10] Mike Thomas:
Keep your eye on it. You never know.
[00:30:12] Eric Nantz:
Yes. Yes. And, as we get to our last highlight here, let me preface this by saying, when we talk about highlights, that's really more a general term for the areas that we think are most newsworthy and that will probably spur the most discussion in a particular issue. Because I will admit, on the surface, when we talk about this last post, I would not call it a highlight per se, because we're about to get a little heavy here. But I think there are some thoughts that Mike and I definitely wanna share about this. So first, if you have been using R, and in particular frameworks like R Markdown, for any amount of time, I think if you're not new to the community, you probably know who is most directly responsible for this.
We have Yihui Xie to thank for all these amazing innovations in R Markdown, alongside knitr itself, which honestly made R Markdown possible. knitr, for those that aren't aware, is kind of the engine behind frameworks like R Markdown, in the spirit of Sweave, which comes built into R itself. But, admittedly, knitr is, in my humble opinion, much easier to build upon, much easier to customize. And once the R Markdown format came to be, there are just so many lives that have been transformed in professional development, personal development, sharing your data science, you know, blogging with R Markdown and, hence, blogdown. In essence, what we call the down-verse, you might say.
Yihui is directly responsible for this. Yihui had a blog post early this year, just a week ago, and I got wind of it from a somewhat random post on Mastodon, and I almost didn't believe it. What has happened is that Yihui was, unfortunately, given notice. Posit has decided to take away his full-time role, albeit while paying him on a contracting basis to support the packages that the R Markdown ecosystem depends on; so, like I said, the aforementioned knitr, R Markdown itself, and the like. This has come as a shock to many of us in the community, and frankly, it sounds like it was a surprise to Yihui himself in this post.
But to Yihui's credit, he has been very gracious in his response to this. He has been very cordial in acknowledging Posit for all the years that he's been able to work on knitr and R Markdown, and the newer team members that have really stepped up to help him. He names a few names, but certainly there are many people he has collaborated with that have made R Markdown, this "Yihui-down" ecosystem, so powerful in this space. He also mentions that, as I said earlier, the packages that he's been directly responsible for are not gonna be orphaned, because if they have anything to do with R Markdown and knitr, he is being paid to support them. It's just obviously not full-time pay anymore.
The one exception to this would be DT. It sounds like Posit is gonna find a new maintainer for that package, but, again, this is just another consequence of the change. The other interesting part of Yihui's post is something I've been observing a little bit, and as someone who has been a fan of Linux and Unix for many years, it definitely rings true to me. Yihui has been exploring a more minimalist approach to some of his software development. I've seen some of the newer packages or newer utilities he's been spinning up; in fact, one is called TinyTeX, a way to get LaTeX installed in a very streamlined way onto your system without going through a full, bloated TeX distribution, things like that. And he's also experimenting with other areas in this, and it sounds like it's more of a philosophical shift.
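As a quick aside for anyone who hasn't tried TinyTeX: getting a working LaTeX toolchain from R takes just a couple of calls. A sketch, assuming a standard CRAN setup; see the tinytex documentation for proxy and installation-path options:

```r
# Install the R package, then the minimal TeX distribution itself
install.packages("tinytex")
tinytex::install_tinytex()

# From then on, missing LaTeX packages are fetched automatically the
# first time you render PDF output from R Markdown or Quarto
```

That automatic on-demand package installation is a big part of why it feels so much lighter than a multi-gigabyte full TeX install.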
He's not saying either approach is right or wrong: a minimalist approach versus an approach like Quarto, or even Shiny itself, that tries to encompass a lot of things. I think they each have their value. Right? It just depends on your philosophy and where you want to take your development. Now, really, the part that hits home here is that it does sound like this was a bit of a surprise. And Yihui, for those that don't know, is a father. He's married. He's got his own house. He is definitely supporting his family. So this is now, to be perfectly frank, an uncertain time for him.
So at the end of the blog post, he did something he hardly ever does; I don't remember the last time he's mentioned this. He said that until he's able to land on his feet with, say, a new role in whatever industry he chooses to be in, he is asking for a bit of help because of the concern he has with this big change, and he does have a GitHub Sponsors page. Mike and I, full disclosure, are both sponsors of his work, and gladly so, because we value so much of what he's done for us. But I will say, after this post made the rounds, he has received just an earth-shattering amount of acknowledgments in the comments of the post: well over 100, and it had hit triple digits by the time I commented on it. If you ever had any doubt of the transformative effect Yihui has had with his efforts in the R community, just read the comments on this. There are so many that have said we owe Yihui so much because of R Markdown: what it's done for reproducible research, what it's done for their reporting, what it's done for their being able to connect with the community. Like I said, blogdown.
I use blogdown. That's how my podcast site was built, for goodness sakes. There are so many ways that we have been leveraging his utilities. So I will admit I am frankly disappointed at the way this came about, and I am, again, giving full credit to Yihui for being so gracious in this post. Certainly, if you've gotten any value out of Yihui's open source work, I think everybody makes their own decisions. But to me, the effect he's had on my development and my adventures in the community is huge. He was my first interview on the R Podcast, for goodness sakes. He actually said yes. I put him into a room at this local conference called MBSW with my makeshift mic setup, and he talked with me.
And he was so gracious. I felt like I was Luke Skywalker meeting Yoda, and I'll never be at Yoda's level, but I'm just saying that's how I felt. And in subsequent interviews, he has given me some of the most candid thoughts I have ever had on that show. So I feel personally that, you know, I consider him a good friend. Obviously, we live far apart and we only get to interact briefly, but he has done so much for me personally. And if you've had any benefit from his work, I would at least consider helping him out in this current time.
But I share what everybody said in the responses to this, both in the post and on X and Mastodon. So many people's lives have changed because of what Yihui has done. So I sincerely hope he lands on his feet soon. We're thinking of Yihui, and if there's anything we can do, obviously, we are here to help. Best of luck to you. But again, the post spurred so many thoughts after reading it. Like I said, it's not a highlight in the traditional sense, but the impact Yihui has had on the community cannot be overstated, and I think this blog post clearly shows that. So that's a lot for me. Mike, what do you got?
[00:38:40] Mike Thomas:
Yeah. I guess I'll start with maybe a couple of calls to action. The first call to action is: if you have benefited at all, especially financially; you know, if your salary, what you do for a living, includes or ever included leveraging R Markdown, and you essentially were paid to use R Markdown, which is obviously free, I would have a hard time not justifying sponsoring Yihui in some way. I know a lot of people have, including you and I, Eric, but the value, and I don't think it's unfair to say monetary value, that the community has gained, in terms of a lot of employed data scientists getting a ton of value from R Markdown...
I'm not sure that's even quantifiable. So if you can, I don't think there's any better use case for making that charitable contribution to Yihui at this time in his journey. So that's my first call to action. And my second call to action would be: if you have an opportunity within your organization, whether it be a contract opportunity or an opportunity where you think you need someone with his skill set as an incredible software engineer, reach out to him. Go to rweekly.org and check out this blog post; it's called "Bye, RStudio/Posit!"
I'm sure his contact information is at the bottom of the blog post; I know his GitHub is, and he has an about page linked in that blog post with a contact link where you can get a hold of him. So if you have opportunities for Yihui, reach out to him and see if he may be interested, you know, because he is not just this mythical Yoda. He is a person with a family as well. And I know at this point in time, especially within tech, there are a lot of folks that are probably experiencing similar things depending on where you work.
So it can be a tricky time, and unfortunately, Yihui has been bitten by it as well. So let's try to lift him up, you know, because he has given so much to us, those of us who are data scientists that use the R Markdown ecosystem and pagedown and bookdown and all these different utilities. So that's what I wanted to start out with: a couple calls to action, if you haven't yet really considered sponsoring him or reaching out to him with opportunities if you see them. Second, it's hard, Eric, not to be emotional about this. You know?
And layoffs in business are a reality. One thing that is disappointing to me, and I don't think I'm being too frank here, is that it seemed like it was quite a quick surprise to Yihui, someone who had worked there for 10 years and really given so much to what RStudio has been able to create. While I understand that a lot of folks worked on Quarto, I have to imagine that a lot of it stands on the shoulders of what Yihui built within the R Markdown ecosystem over many years. So it's really disappointing that it sort of came as a surprise to him. I'm very glad that they are at least employing him as a contractor to be able to work on some things. I guess the reality of business is that if you don't have enough work for someone, then you probably only want to pay them for the amount of work that you have for them. So I don't know if the transition to Quarto means that there's less work for Yihui to do on some of the software that he was maintaining and working on. That's just a guess, and we don't necessarily have that information at this time. But it's emotional, I think, more so because of the way that Posit is structured as well, as a public benefit corporation and not just a regular corporation.
Right. And that creates sort of this dynamic between Posit and the R community that's unique. So I think when changes take place at Posit, we feel the effects of that sort of personally, in a way, and that's very unique. And I think open source is really rooted in transparency, right? When we're doing open source work, one of the benefits and really cornerstones of open source software development is that others can see exactly what we're doing and what we're working on, and they can contribute to it and try to help and things like that. So I guess I would like some of these changes, for lack of a better word, to be a little bit more transparent within the community. It feels like maybe there are some walls being built up between the community and Posit at this point that may not have previously existed. And I think of things that come out like this, you know: the loss of a lot of the Posit Academy folks, as well as Yihui now.
I think there's not a lot of acknowledgement of it, except maybe by the folks who are directly affected. It's unusually quiet from their perspective. So I guess I would like to get a better understanding of the directions that things are going, a better sense of transparency, just because of this really unique relationship that the R community has with what used to be RStudio and what's now Posit. It's something that is emotional, I think, to a lot of us; I don't think I'm just speaking for myself. So, you know, this was a tough blog post to read. I think Yihui may be taking it better than some of us reading it. He's taking the high road to every extent in this blog post. He's extremely grateful to the folks that he worked with and who employed him for the past 10 years.
He talks about in the comments of the post that in Chinese, the word for crisis consists of the characters for danger and opportunity. He's optimistic about the opportunity part, he's not very concerned about the danger, and he really believes it's going to be a blessing in disguise. So if nothing else, I really hope that Yihui finds a new role that he's excited about and loves, that he really lands on his feet and finds a next great opportunity for himself. So I think we have a lot of thoughts on this blog post, and I'm glad that we talked about it. It is, you know, emotional to me; that's sort of the best word I can use to characterize how I feel about this blog post, and what Yihui's work means to us, and what he as a member of the community means to us.
[00:46:05] Eric Nantz:
Yeah. I echo a lot of that, and you are not alone. I've seen posts on Mastodon from others a little concerned about, like you said, the silence on some of these maneuvers. Because, for those who aren't aware, as we talked about last year, Posit did do some layoffs that affected some of their open source division as well as other divisions. And, yes, you might be able to hear from people affected by it directly, but there wasn't a lot said in the public-facing communication on that. And as of now, I'm not seeing any new response to this either.
I do think this may be a wake-up call in a couple senses. Yes, there is an interesting dichotomy where, unlike most of the tech industry, Posit is a PBC. That puts you under a different lens, in my opinion. Whether you consider that right or wrong, they chose it. Right? They chose to be a PBC; that was JJ's vision that we heard in the keynote a couple years ago at rstudio::conf. With that, and again, Eric's going on his soapbox here, I think we are owed a little bit more transparency on this, as those that are not just fans of open source and fans of data science; this tooling is immensely important to the work we do, especially on the open source side of it.
Yes, of course, the commercial products help too. But for those that don't know: when Quarto compiles anything to do with R execution chunks, it's using knitr under the hood. Guess who wrote knitr? Yihui. So Quarto is not possible on the R side without what Yihui did. Posit is now funneling more resources into Quarto development; fair play, it's their company, and obviously interoperability is a big focus for them now. But at the same time, they are indeed standing on the shoulders of the absolutely giant efforts that Yihui has built here. So, again, I do think a little more transparency is warranted.
And, you know, it has spurred on some concerns about future directions; that could be a whole other podcast in and of itself, so I think we'll just leave it at: we're interested to see what 2024 holds. But, again, full marks, full credit to Yihui for taking the high road on everything here. And the response from the community has been eye-opening, to say the least. I misspoke earlier; ever since the blog post, he has been hearing from many, many people: familiar names, new names, those that maybe are what I call dark matter developers and may not comment very much, even when they see somebody that has that impact on their daily work or their daily data science journey. They're coming out to say their thanks. So I do think, in open source in general, we need to do a better job of thanking contributors, not just in times like this, but regularly throughout the year, because it can pick you up. If you're an open source developer having a rough time trying to maintain something, just having that pick-me-up really helps. We don't have to wait for an event like this. With that said, full credit to Yihui for being gracious in this, and I would imagine he is going to be hearing from a lot of people about future opportunities.
I hope he takes what's best for him and his family, and I'll certainly be very curious what the future holds for him. Well said, Eric. Well, yeah, it's hard to transition from that, but that was a jam-packed summary of our highlight stories here. We're gonna close out with a couple of additional finds from this issue. And you heard me mention in Nicola's segment that she's gonna be pursuing Nix as a tie-in with R. Well, of course, our good friend Bruno Rodrigues has been continuing his journey with the Nix package system and R, and he has part 8 of his reproducible data science with Nix blog series, which gets more into the weeds of just how open source plays a critical role in Nix itself.
And, in fact, every package in Nix is simply, well, I should not say simply, but it is a set of scripts that will take the upstream utilities, bundle them up in a way that Nix can understand, so we can install them via the Nix package manager right then and there on the spot. And, of course, through Bruno's work on the rix package, he is trying to make that easier from within R itself. Acknowledging that many R users are very comfortable with the RStudio IDE, he's been trying to make that part of the rix packaging process too, so you get a version of RStudio inside that reproducible project.
So that then ties into the custom R installation and custom set of packages that the Nix package system is exposing. Well, RStudio is a bit of a finicky piece of software to compile, run, and install manually, especially around that little thing called macOS. It's a bit unwieldy. So in his post, he actually talked about setting up a GoFundMe to help pay for a specialized vendor to assist with that part of the process for macOS users. Unfortunately, it did not hit the funding goal. But to Bruno's credit, he has donated the funds that were contributed back to the R Foundation itself. So fair play to Bruno for at least helping the R Project benefit from that request.
But I'm gonna conclude this additional-find summary by coming back to Posit for a second. Posit? If you don't know now, you know: Nix is becoming a thing in data science. It's not just Bruno; many others are pursuing this too. To your IDE folks: maybe you could help out a bit on this too. Just saying. I'll leave it at that. Mike, what do you got?
[00:52:31] Mike Thomas:
There's a call to action. So that's a great find. I still gotta get my hands dirty with Nix at some point, and I know Bruno has ample resources available to help me do that. One blog post I found that I think caught a lot of fire this past week and was really interesting was from Emily Riederer on Python Rgonomics. It talks about, essentially, if you are an R developer needing to switch to Python for a particular project, or just trying to learn a little bit of Python to keep up, how to map some of the concepts from R into Python to get you up and running. And just as a quick highlight, here's some of the tooling that Emily recommends you use to get started with Python.
For installation, she recommends the pyenv tool, which allows you to switch back and forth between different versions of Python really easily. So if on one project you need to use Python 3.9, and on the next project you need to use Python 3.10, pyenv allows you to really easily switch from the command line, I believe, between those two, or multiple, versions of Python. You can also set one to be your global version of Python to use as a default. For data analysis, we have heard a lot about the pandas package being the equivalent, maybe, to dplyr. Emily argues that the Polars package actually has a syntax that is more similar to dplyr, and I agree. It's also more performant.
Polars is a really, really cool package, and it's really efficient for working with data, large data in particular, but small datasets as well. I think you'll find that the syntax looks very similar to what you would find with dplyr verbs such as select, filter, and group by; they all map between dplyr and Polars, so the code should be pretty familiar to look at. For communication purposes, she references the Great Tables package in Python, which is the port of the gt package by Rich Iannone, I believe, at Posit, from R to Python. It's very new, so if you're a Python user looking to author really nicely formatted tables within your reports, I would check that one out. And then for notebooks, obviously, Quarto.
And lastly, in terms of environment management, I think she references the PDM tool in Python as maybe being the one that would be equivalent to the renv package, or at least the one that she prefers the most. So it's a great list of the different tools that Emily uses to get up and running across all of her Python projects, as well as an explanation of why she uses them and how they may be similar to or differ from your experience using similar tooling in R.
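To make the dplyr-to-Polars mapping concrete, here's a hypothetical side-by-side: an R pipeline on top, with what I believe the rough Polars equivalent looks like noted in the comments (the Polars syntax is from memory, so treat it as a sketch rather than gospel):

```r
library(dplyr)

mtcars |>
  filter(cyl == 4) |>              # Polars: .filter(pl.col("cyl") == 4)
  select(mpg, cyl, wt) |>          # Polars: .select("mpg", "cyl", "wt")
  group_by(cyl) |>                 # Polars: .group_by("cyl")
  summarise(mean_mpg = mean(mpg))  # Polars: .agg(pl.col("mpg").mean())
```

The verb-per-step, chained structure is what makes Polars feel so much closer to dplyr than pandas does.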
[00:55:34] Eric Nantz:
Yeah. Emily does a fantastic job here. This is something that I've struggled with immensely, even just knowing where to start. I have a couple of projects where, because I'm extending a package in Python for some RSS feed stuff, I'm probably gonna stay in that ecosystem, or some of my very in-depth Podcasting 2.0 explorations. So having this familiarity of knowing where to go, to maybe summarize that podcast database effectively in my Quarto notebook, going with these packages that give me that kind of R flavor a bit: for somebody that's new to Python, I think that is extremely helpful. So this should be your go-to post if you're like me and you're just dabbling your toes in Python for some interoperability work without it being your full-time focus. Just knowing where to go first is immensely helpful, because there's a wealth of choices out there, and you can go down rabbit holes on environment management and, frankly, bang your head against the wall. I can't tell you how many times I've borked projects at the day job because of a Python venv setup gone horribly wrong. Don't get me started on Anaconda.
There be dragons on my HPC system with that. So, yeah, I'm sure I'm gonna be playing with a lot of what Emily is recommending here quite a bit.
[00:56:55] Mike Thomas:
Dev containers.
[00:56:57] Eric Nantz:
Dev containers. Yes. Exactly. I do have hooks in that in mind, if only the rest of my projects could benefit from that. Well, as you can tell, we've had some fun banter here. You can tell it's been a few weeks for Mike and I, so we came with our opinions, as they say. But, of course, there is much more to R Weekly itself than us just bantering. The full issue has so much more great content, lots of new and updated packages, some that really caught my attention on the visualization side of things especially. So definitely check that out at rweekly.org. Also, we love to hear from all of you in the community.
And going back to what we mentioned with Yihui, I'm gonna make this callout now. I will admit we don't get the whole boost thing very often on this show; I hope that changes from time to time. But for the month of January, if any of you are gracious enough to boost this show with your favorite podcast app, or on the Podcast Index itself, which I have linked to in the show notes, I will funnel that directly to Yihui. So if you're interested in supporting Yihui in a different way, a boost would be a way to do it, and I will personally make sure that happens.
So definitely keep that in mind. But, also, we just love hearing from you in general. We have a contact page in the episode show notes, and we are somewhat active in the social media spheres. I am more active on Mastodon, where my handle is @[email protected]. I'm sporadically on the X thing at @theRcast, and I'm also on LinkedIn from time to time. Mike, where can they find you?
[00:58:32] Mike Thomas:
LinkedIn is probably the best place to find me. I think my tag there is Michael J. Thomas II. You can also find me on Mastodon at @[email protected].
[00:58:46] Eric Nantz:
Very nice. Very nice. And like I said, it's great to be back in the swing of things with you. It feels more normal again as we kick off the month of January with this supersized episode we just gave you all here. Well, that will do it for us. Like I said, we came with our opinions; hopefully, you enjoyed it. We'd love to hear from you, and we will be back with another episode of R Weekly Highlights next week.
Yeah. It's hard to believe it's 2024 already. But just like last year, I don't do this alone. I got my awesome cohost, Mike Thomas, joining me once again for another year of R Weekly Highlights. Mike, how are you doing today?
[00:00:45] Mike Thomas:
I'm doing well, Eric. I'll be honest with you: I missed this. I hope the audience missed it as well; I missed it just as much as you folks. So I am so, so excited that R Weekly is back, that we had a great curator this week, and 2024 is hopefully now off and running for us here on R Weekly.
[00:01:16] Eric Nantz:
Yes. It is amazing how this makes me feel a little more normal again, going back on the mic with you, so to speak. It was definitely a busy break for me, but it's great to get things back in motion on the R side of it. And, yeah, that curator, that wacky individual, that was actually yours truly this time. And I really went above and beyond in a different sense, and this is totally unrelated to the issue itself. But I've seen some cool stuff built with Quarto lately; I saw the new dashboards we've actually talked about on this very show recently. I thought, you know what?
For our curator team, we often have to share this rather cryptically formed URL of our public curation calendar. I host it on Nextcloud, which is great, by the way; nothing against Nextcloud. But the URL they give me for sharing, yeah, that's just not gonna be easy to remember. So I thought, well, wait a minute here. Because of my streaming adventures years ago making what I call the streamer's calendar, I know there are some HTML widgets out there that could play nicely with Quarto to render this same calendar, but hopefully as a more friendly dashboard, giving us both a calendar view and a more tabular view of who's on which week for the curation.
So, yeah, I took a few days, and I now have a Quarto dashboard of our public curation calendar, which I'll link to in the show notes. It's hosted on GitHub Pages, but it takes heavy inspiration from Garrick Aden-Buie's Norfolk data Quarto dashboard. And I figured, you know what? If he can do it, so can I. So that is now publicly available; I have a link to it in the supplements of the show notes in case you wanna have a look. It was my first adventure with a Quarto dashboard. So after I got that squared away, it came down to business: curating this week's issue.
And also, like I mentioned, this is a supersized issue because we were off for a few weeks as a whole team, so there's a collection of stories here from previous weeks that would have been released if we weren't on break. So either way, buckle up. We got a lot to talk about, and there's a lot to read after the show too. But I can never do this alone. I have tremendous help from our fellow R Weekly team members and contributors like you around the world with your awesome pull requests to make all this happen. And we lead off with a nice retrospective-like post on areas that we've talked about last year, especially with respect to how you can streamline your R workflows.
Our author this week is the esteemed Nicola Rennie, who has been a frequent contributor to R Weekly highlights in the past. She gives us this nice end-of-year blog post talking about four ways that she's learned to streamline her R workflows. Each of these hit home with me, and I'll be curious, Mike, how much you've been using these in practice as well. The first area she talks about may sound basic, but, boy, is it effective: using template files. Because everyone does this; we have that set of scripts that we created, and then there just might be a new data type, but it's mostly the same, maybe just a couple changes in variables or a similar service you're pulling from. And, yeah, you do the copypasta, as they say.
Well, why not make a more templated structure, like Nicola did with her TidyTuesday analysis scripts? Now she has a set of scripts that she can create dynamically, based on templates, for the current week of TidyTuesday, with all of her reusable functions for visualization, other aesthetics, and data processing all set to go. And, yes, this blog post has links to the individual blog posts where Nicola talks about this in more detail, which you can check out. So that is a great way to do it; low-hanging fruit, as they say, to get you some much needed savings in time.
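To make the template-file idea concrete, here's a minimal sketch (the file names and placeholder syntax are hypothetical, not Nicola's actual setup): keep a template script with placeholders, then write a tiny helper that fills them in for the current week.

```r
# template.R contains placeholders such as {{date}} and {{week}}
new_tidytuesday <- function(date = Sys.Date(),
                            template = "template.R") {
  script <- readLines(template)
  # Substitute each placeholder with this week's values
  script <- gsub("{{date}}", as.character(date), script, fixed = TRUE)
  script <- gsub("{{week}}", format(date, "%Y-W%V"), script, fixed = TRUE)
  outfile <- paste0("tidytuesday-", date, ".R")
  writeLines(script, outfile)
  outfile
}
```

Packages like whisker or usethis offer more polished templating, but even a helper this small beats copy-pasting last week's script.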
And speaking of templates, if you find yourself making multiple GitHub repos that look really similar in structure, especially at the beginning, yes, I feel seen; I do this almost every day at the day job. You can leverage GitHub repository templates. This is huge, because instead of starting from a blank repo, getting your renv library in there, getting your helper scripts for that data processing or whatnot, why not start yourself on the right foot? Have a template structure in place. Nicola has been making great use of this in her workshop materials, because she was quite busy in 2023 teaching workshops.
I'm still very privileged that we had her at the R/Pharma series of workshops talking about machine learning, and she did a tremendous job with the material in the GitHub repository that she created. But, yeah, that now starts in her workflow from a GitHub repository template. I have also made use of this with my fancy-schmancy Docker container setup. Now, every time I know I'm going to do an R-related project where I want to leverage VS Code with a Docker container, and the custom RStudio IDE server edition inside that container, instead of repopulating all those Docker files one by one, I just create my new repo from a repository template. That saves a heck of a lot of time. I only have to change a couple environment variables, and then, literally, I can bootstrap that thing in less than 5 minutes.
It is, again, a huge time saver for me as well. Now, third up in this list is one I need to be much better at, so it's confession time with Eric here: I do not do a great job of linting and styling my code. Well, guess what? There are R packages that help you with this; in particular, the lintr package and the styler package. These can take away all those spacing issues and syntax errors that you might not detect until you actually run the code. If you author Shiny apps, you know what I'm talking about: you're missing that bracket, or you're missing that variable for that reactive, and you're in trouble. Don't get me started on that. But it's not just finding those errors; it's also making sure that your code has a consistent style. styler can be customized to meet your needs, but it's gonna get you off on the right foot very quickly.
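In practice, the two packages slot into a workflow with just a couple of calls. A sketch, assuming both are installed from CRAN and that `app.R` is the hypothetical file you want to check; each tool can be configured per project (lintr via a `.lintr` file, styler via `tidyverse_style()` arguments):

```r
# install.packages(c("lintr", "styler"))

# Flag style issues and likely mistakes (unused objects, odd spacing, etc.)
lintr::lint("app.R")

# Restyle a single file in place; styler::style_pkg() and
# styler::style_dir() cover whole packages or directories
styler::style_file("app.R")
```

Running lintr in CI and styler before commits is a common split: one flags problems, the other fixes formatting automatically.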
And boy, is that helpful for multi-team, multi-person projects where you wanna make sure your whole team has a unified approach to how you style the code. So I would definitely make use of styler, and also lintr to find all those nagging issues that you would manually have to detect yourself in the old days. And lastly, yeah, at the top of the show, I talked about Quarto. Right? Well, Quarto comes with a lot out of the box, and I do mean a lot, but there may be things that you wanna do to build on top of it. And so Nicola had an adventure in the early part of 2023 building a Quarto extension to help extend the styling that she did for PDF reports.
This is really useful, especially if you wanna kinda get that similar flavor as you might have had in the past with some of these custom R Markdown templates, but wanna have a similar thing with Quarto. These ways that you can build extensions might be a path forward. It definitely does take a little getting used to. I mean, certainly, if you're comfortable with LaTeX, you're gonna be right at home with some of the styling for PDF output. But in general, I believe it's using Lua scripting, so you might have to level up on that a little as you go along. But the linked resources in her more detailed blog posts can get you up and running quickly.
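To give a flavor of that workflow (the extension name below is hypothetical), the Quarto CLI can both scaffold a brand-new extension and install a published one straight from GitHub:

```shell
# Scaffold a new extension in the current project (interactive prompts)
quarto create extension

# Or install someone else's extension from a GitHub repo
quarto add some-org/fancy-pdf
```

Once installed, your document's YAML header references the custom format the extension provides, and rendering picks it up automatically.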
And also, for a plug, I mentioned the R/Pharma workshops earlier. We had an R/Pharma workshop from Devin Pastoor about building Quarto extensions. So I have a link to that in the supplements of the show notes as well, if you want to really get into the weeds of building a complex extension. It's definitely something you might have to get used to. But again, if Quarto is not doing everything you want out of the box, you can definitely customize it. And I am a very big consumer of extensions, especially for the web-based presentations with reveal.js. Big shout-out to Emil Hvitfeldt, who has built immense extensions such as, like, the code window, even that fun little confetti extension. I'm not sure if he created that one or if that was someone else. But I make use of quite a few in the reveal.js space.
So, again, these are all great things that are attainable in your R journeys for making things a little more automated, a little more repeatable. And, yeah, Nicola concludes the blog post with some intriguing notes on what she's interested in. And, boy, she's also gonna be pursuing the Nix package train as well. So Bruno is not the only one; I'm also trying to pursue this as well. So we might be seeing some blog posts from her about that. And, also, it looks like she's looking at Rust as well, another framework that's been getting a lot of attention in the R community these days. So it's a fantastic way to inspire you at the start of this new year to maybe supercharge your workflows a little bit.
So, Mike, what did you think of Nicola's,
[00:10:45] Mike Thomas:
blog post here? I couldn't agree more. I think there are a lot of tips in Nicola's blog post that would help us get off to a fresh start in the new year and start employing some of these best practices for ensuring that your workflows are consistent across projects and make you as efficient a data scientist as you can possibly be. I think using template files is a great use case. I am so guilty of sometimes going from project to project and copying the majority of a README file, for example, especially where we talk about handing our projects off to clients and how they can use the renv package that we have, you know, in our deliverable, to essentially reproduce the environment that we created it with.
And, really, I think READMEs for us could be a template file and a great use case that a simple script, like the one that Nicola actually has screenshotted here, could help with. You know, I know that there are packages like usethis and devtools and golem and things like that that will actually create scripts for you. And the code is pretty lightweight, so don't be afraid to do it yourself. I think it's a really awesome trick that can be really underrated as well. Using GitHub repository templates, I will say, is one that I actually do and that we actually do internally. And that is hugely helpful. And I know, Eric, you do this quite heavily, at least within your own personal development work, because you have GitHub repository templates that spin up your whole environment that make use of dev containers.
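A lightweight template script in that spirit might look something like this sketch, assuming a `templates/README.md` file with `{{placeholder}}` markers (all file names, function names, and values here are made up for illustration):

```r
# Fill a README template by swapping {{placeholders}} for project values
new_readme <- function(project, client, template = "templates/README.md") {
  lines <- readLines(template)
  values <- c(
    project = project,
    client  = client,
    year    = format(Sys.Date(), "%Y")
  )
  for (key in names(values)) {
    lines <- gsub(sprintf("{{%s}}", key), values[[key]], lines, fixed = TRUE)
  }
  writeLines(lines, "README.md")
}

new_readme(project = "sales-forecast", client = "Acme Co")
```

A dozen lines of base R is often all the "templating engine" a project needs.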
Right? So you can go essentially from project to project and have your isolated environment and your development environment sort of consistent from project to project. And that's really helpful as you go from project to project, to ensure that all of the dependencies that you need to have within your projects, the ones you're sort of consistently applying the same workflow to, can get installed quickly and consistently. I've seen a lot of articles lately as well from Rami Krispin, who is trying to, I think, accomplish a lot of the same things that you've talked about as well, Eric. And he has some great content online about how to set up your own GitHub repository templates, I believe, for both R development and Python development, using dev containers that have those devcontainer.json files to sort of specify how you want your VS Code environment to look, and then a Dockerfile to manage all of those dependencies.
And one of the cool things about that as well is, if you're interested in leveraging GitHub Codespaces, you can essentially apply that immediately just by clicking a button within GitHub, and it will leverage those dev container assets to spin up that environment in VS Code in the cloud for you. You don't even need to have VS Code installed on your own laptop. I know there's a cost associated with that, but it's pretty cool. And when we think about collaborating across teams, you know, these things become really important. And for us, that's not only collaborating across teams; it's also handing off our deliverable to the client at the end of the day and ensuring that they can reproduce our work, and whatever we've created for them in terms of that deliverable, exactly in the same way that we intended it to be and the way that we created it. So that's really helpful.
And then the section on linting and styling code, you know, it must be something in the water on this podcast, Eric, because I am also guilty of not utilizing the lintr or the styler R packages. I am very strict. We have an internal handbook about how to style our R code and our Python code, and we try to adhere to that pretty well. And obviously, within our code review process, that's a component of it: to make sure that the code is styled in line with sort of our expectations internally. But I was talking to a client recently who we're trying to help set up a good team data science workflow and implement some best practices within their organization, really trying to convince them that code styling is important for collaboration and streamlining code review, and for making all of those processes more efficient.
And one thing that I think I failed to mention to them, that I need to mention to them, is how the lintr and the styler packages can expedite that process and maybe also provide safeguards to ensure that your code is styled in line with the guidelines that you've set up for your organization. And maybe that's not necessarily the default styling that these packages have, but I know that at least within one or both of these packages, you can actually apply some settings to how you want that code to be styled. And you can make some changes to how you want code to be styled internally at your organization, or maybe just across your personal development projects. And I think that is incredibly powerful and incredibly underrated, which I think is a theme of this blog post as well.
I love the section on building Quarto extensions. You know, Quarto is super hot in the streets right now. We are moving everything to Quarto. All new projects are starting with Quarto. One thing that I love as well, that I think aligns very well with Nicola's section here on building Quarto extensions, is that we, unfortunately, deliver a lot of PDF reports. And that means that we use LaTeX quite a bit. In the Quarto YAML file, you can actually set up your LaTeX assets, your .tex files, such that you can have parameters within these LaTeX assets.
For example, like the title of your report, a background image in your report, a subtitle, things like that, that may change on a project-to-project basis. And you can specify those parameter values within your Quarto YAML file, and those will get passed to those LaTeX files. We do that on every single project, you know, because the project title is gonna change. Maybe the image that we'll have on the cover page is going to change. Things like that. But it's amazing how well those things play nicely with each other. So if you're someone out there who finds yourself delivering a lot of PDF reports through Quarto, I would recommend checking out some of the links that Nicola has in this blog post. If you really wanna spruce up the look and feel of that PDF report, then I think learning a little bit of Lua can go a long way as well. But there are ample links here, as well as on the Quarto website, that I think can help you really make that Quarto PDF LaTeX report look nicely branded to your organization, or look and feel however you want it to, and customize it nicely. And Nicola gives us a little bit of a preview into some of the work that she's planned for herself in 2024, and I can't wait to see what comes next from Nicola, because it's always super relevant, always super helpful, and I learn a lot.
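A rough sketch of the kind of setup Mike describes (the file names and metadata values here are illustrative, not from any particular project): metadata set in the document's YAML header becomes available as pandoc template variables inside the LaTeX assets that Quarto's PDF format can pull in:

```yaml
# report.qmd YAML header
title: "Q1 Sales Forecast"
cover-image: "images/acme-cover.png"
format:
  pdf:
    template-partials:
      - partials/before-body.tex
    include-in-header:
      - assets/preamble.tex
```

Inside `partials/before-body.tex`, pandoc-style variables such as `$title$` and `$cover-image$` then pick up those YAML values, so only the YAML needs to change from project to project while the .tex files stay fixed.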
[00:18:03] Eric Nantz:
Me as well. And there are just so many threads here that I wanna implement, both in my open source work and in my day job work. In fact, for the internal user group I maintain, we're even thinking of having some kind of quarterly or bimonthly newsletter that can go out highlighting current events and in-company initiatives. Maybe I could build that with Quarto, with a PDF setup like she's done in her earlier efforts. And, yeah, the linting thing, I really gotta get better with it. What's interesting is that, in my dev container setup, I do have a hook to do linting automatically in VS Code with a .lintr kind of config file. So I'm getting there on the right track. I just have to actually listen to the advice. But I will say VS Code is very obvious when it thinks you wrote a line longer than 80 characters. It will put a big old squiggly next to it, and it will annoy you enough that you probably will wanna change it. But certainly, I wanna make that more automated too and just build better habits. But, yeah, those two areas, and, yeah, beefing up my templates in general.
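For reference, that `.lintr` file is just a small config file at the project root that lintr (and editor integrations) pick up automatically; a minimal sketch, where the 100-character limit and naming style are arbitrary examples, might look like:

```
linters: linters_with_defaults(
    line_length_linter(100),
    object_name_linter("snake_case")
  )
```

Anything you can express with lintr's linter functions can go in there, so the editor squiggles and a CI lint step can enforce the exact same rules.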
Even just today, I have a project where I want to enhance my Shiny bookmarking-state hack that I've done at the day job. Instead of having to build that up from scratch in each app, one by one, I want a template structure I can just fold in, because I can't quite make it a package yet, but I just want those functions there every time. So I don't have to manually copy-paste them from the existing repo into the new one, change the path, change whatever. No. Ain't nobody got time for that in 2024. So there are lots of things here I wanna make use of, and I really appreciate Nicola's efforts on this. Couldn't agree more.
And now we're gonna shift gears a little bit to a very interesting proposal for a possible future enhancement to the R language itself, one that is in the spirit of a recent enhancement that came a couple years ago. And that is making the case for a new pipe assignment operator in R itself. This blog post comes from David Hugh-Jones. He's been a very active blogger in the R community, and he makes this case by talking about the status quo of how R does assignments. You can either pass the value itself, you know, very much saying object, whatever, assign it the result of another function, or give it a constant, or what have you.
Sometimes it gets really unwieldy when you have a variable that is being built up by a chain of functions. Right? This is one of the motivations for the pipe operator itself, which, of course, was first brought to light by the magrittr package. And since R 4.1, we now have the base R pipe operator built in, which functions largely the same way. So you can construct, say, your data processing steps in a pretty nice, streamlined workflow using packages such as dplyr and many others that are pipe-friendly. Well, David wants to take this up a notch. What was in the magrittr package, and didn't get a lot of press at the time,
was its own version of a pipe assignment operator, where it looks like you're going to do processing on a variable and then print it out kind of interactively, but it would actually change the value of the variable you had on the left-hand side. So David is proposing: why not do this in R proper? And he proposes this operator as, like, a less-than sign, a pipe, and then a greater-than sign, so it looks like a pipe surrounded by the two signs, and then having that feed the variable through a bunch of functions, potentially, with the result assigned back.
And his rationale is that it could make the code much simpler to read, more expressive, and more succinct in the syntax that you're typing. Now, this is where it gets interesting. It's one thing to conjecture this, but he decided to take matters into his own hands and actually look at what is being used in the general set of code that's being shared in open source, to see where this pipe assignment could be useful. So he has actually authored his own new package, called codesamples, which apparently has scraped open source code that has been shared on GitHub and Stack Overflow, as well as the examples that come from R packages themselves.
And he's done a nice little summary here in the blog post about how many times he has detected code that is pipeable, and also pipeable and complex, having, like, more than a few lines in the piped operation. And the post says about 4 to 10 percent of operations could be simplified by an assignment pipe, and that could be a potential gain. So you might want to look at this for yourself, to see whether your code has examples that match some of David's, such as what he has from the Stack Overflow questions. It looks like a pretty unwieldy subsetting of a variable called w, and he translates that into a more friendly, pipeable assignment operation, which basically streamlines it down to four lines of code instead of what looks like about 16 or 18 in the previous example.
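To make the idea concrete, here is a small sketch using magrittr's existing assignment pipe `%<>%`, the closest thing available in R today to what David is proposing for base R (the data frame and column names are invented for illustration):

```r
library(magrittr)
library(dplyr)

df <- data.frame(
  region = c("east", "west", "east"),
  sales  = c(10, 20, 30)
)

# With the base pipe today, you have to repeat the variable name to assign:
df <- df |>
  filter(region == "east") |>
  mutate(sales_k = sales / 1000)

# magrittr's assignment pipe updates the left-hand side in place instead:
df %<>% arrange(desc(sales_k))
```

David's proposal is essentially to bless this pattern with a native operator in base R, the way R 4.1 blessed `%>%` with `|>`.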
I will admit I've had a little bit of gotchas when I've tried this with the magrittr approach before, and I admittedly moved away from it. But I can see where David's coming from here. So it is an interesting proposal. Now, for those that are new to the R project itself, it does take a bit of time for things like this to land. And, of course, it has to be agreed upon by the R Core team and whatnot. So there would obviously be a lot of discussion in play, and this would not happen anytime soon if it were to actually take shape.
But with that said, it is interesting to see what this could look like. So I credit David for taking the time to not only put this out into the public domain, but also do some interesting data-scraping analysis of existing code to see where this might benefit. So it's definitely something to ponder. I'm kind of in between on this one. I don't really wanna say no to it, but I'm just not sure if I would be the best customer for it. So with that said, definitely food for thought, as I say. Mike, what do you think?
[00:25:03] Mike Thomas:
Eric, you know, this is one of those things where, at first glance, it kinda gives me the heebie-jeebies, but then it might get implemented, you know, a couple years from now, and then I finally adopt it, and then maybe someday I feel like I don't know how I lived without it in the first place. Yeah. I think that happens occasionally. You know, I will give David a lot of credit, because he absolutely makes the case for it in this blog post. He doesn't just make the case for it; he actually writes the code to implement it in the R source. So he has forked the R source repository, and he has made a commit that, you know, is 1,200 lines of code, C code, I think, mostly, that implements it. Kudos to him. Hats off. He's putting his money where his mouth is, and I think, absolutely, if this does get adopted, if it's something that the community does eventually want, then he will be the person, essentially, that spearheaded this and did a lot of the work to get it into R's base code. Yeah. I think that probably the base pipe itself, for now, is maybe still getting adopted. I would love to see some statistics on how widely used it is. We do still see a lot of code, you know, at our clients, that contains the magrittr pipe, even in new projects that they're working on.
I think maybe people are just slow to adopt new things, so I would imagine that it probably wouldn't be any different if this new pipe did get implemented. And I can't help thinking that it looks like a person standing there with their hands on their hips. That's the best way that I would describe it. You always have to sort of see something in it, you know; it's like seeing a shape in a potato chip when something new comes out. But I think it's super interesting. I have a little bit of trouble with it, though; the explicit assignment with the arrow is something that I think helps me in code review, to be able to really clearly see when an object is getting created.
I know that there are some folks out there, probably more in the tidyverse community, that warn a little bit against the use of the assign function for that same reason, because it's not explicitly showing you where new objects are getting created in your environment as you read some code. That's a function that I've tried to get away from in more recent years. So it's gonna be to each their own. I would say, if this is something that you really feel could help you, then get in touch with David and support the cause. And I guess the great thing about open source is you can use what you wanna use, and you don't have to use what you don't wanna use. Right? I'm sure the assignment arrow is not going away anytime soon, so I'm not very worried. But, you know, the one place where I think this goes back to maybe our discussion on Nicola's blog post is adopting some style guides and sticking to them. Right? So if you want to adopt this new pipe, if it ever gets merged, as sort of the way that you are going to go about assigning objects internally within your organization, then by all means, merge that into your own style guides and into your own consistent practices. But I think consistency is important. I think ensuring that you are writing code that sets up the reviewers or the collaborators on that code for success, and makes their lives easy and efficient, is really important. So that's really all I have on this blog post, my 2 cents.
I don't feel super strongly one way or the other. I think it's really, really interesting. And again, kudos to David for not only making the case, but for going through the entire process of actually implementing it and showing what that might look like.
[00:28:57] Eric Nantz:
Yeah. Definitely credit to him for, you know, following through on what this actually could look like. And, you know, I will never turn down having choice in this space. I mean, there may be others that would get great benefit out of a complex pipe assignment. Have at it. Right? I mean, I don't have to use everything that's in R Core or these packages that I use. I just use what's best for the project and what's best for me. So I will admit it would probably make code review a little harder for me as of now. But, heck, maybe practice makes perfect. Who knows? But, yeah, the assignment operator, you're gonna have to pry that out of my cold, dead hands. I use it every single time, and it helps me reason out where the key variables are in a pipeline.
And, also, to me, it helps make debugging a little easier at the cost of being a little more verbose. For me, debugging and reviewing are probably the main criteria I'm gonna use as I think about whether I would implement this. But, again, I'll be very interested to see what the community has to say. So I'll be keeping an eye on, say, Mastodon and other social areas to see what kind of discussion this spurs on. But we may be hearing more about this sooner than we think. Who knows?
[00:30:10] Mike Thomas:
Keep your eye on it. You never know.
[00:30:12] Eric Nantz:
Yes. Yes. And, as we get to our last highlight here, let me preface this by saying that when we talk about highlights, that's really a general term for the areas that we think are most newsworthy and that will probably spur the most discussion in a particular issue, because I will admit, on the surface, when we talk about this last post, I would not call it a highlight per se, because we're about to get a little heavy here. But I think there are some thoughts that Mike and I definitely wanna share about this. So first, if you have been using R, and in particular frameworks like R Markdown, for any amount of time, and you're not new to the community, you probably know who is most directly responsible for this.
We have Yihui Xie to thank for all these amazing innovations in R Markdown, alongside knitr itself, which honestly made R Markdown possible. knitr, for those that aren't aware, is kind of the engine behind frameworks like R Markdown, in the spirit of Sweave, which comes built into R itself. But, admittedly, knitr is, in my humble opinion, much easier to build upon, much easier to customize. And once the R Markdown format came to be, there are just so many lives that have been transformed: in professional development, personal development, sharing your data science, you know, blogging with R Markdown and, hence, blogdown. In essence, what we call the *down-verse, you might say.
Yihui is directly responsible for this. Yihui had a blog post early this year, just a week ago, and I got wind of it from a somewhat random post on Mastodon, and I almost didn't believe it. What has happened is that Yihui was, unfortunately, given notice. Posit has decided to basically take away his full-time role, albeit continuing to pay him, on a contracting basis, to support the packages that the R Markdown ecosystem depends on. So, like I said, the aforementioned knitr, R Markdown itself, and the like. This has come as a shock to many of us in the community, and, frankly, it sounds like it was a surprise to Yihui himself, from this post.
But to Yihui's credit, he has been very gracious in his response to this. He has been very cordial in acknowledging Posit for all the years that he's been able to work on knitr and R Markdown, and some of the newer team members that have really stepped up to help him. He names a few names, but, certainly, there are many people he has collaborated with that have made R Markdown, you know, this Yihui-down ecosystem, so powerful in this space. He also mentions that, as I said earlier, the packages he's been directly responsible for are not gonna be orphaned, because if they have anything to do with R Markdown and knitr, he is being paid to support them. It's just obviously not full-time pay anymore.
The one exception to this would be DT. It sounds like Posit is going to find someone new for that package, but, again, that's just another, you know, consequence of this. The other interesting part of Yihui's post is something I've been kind of observing a little bit. And as someone who has been a fan of, say, Linux and Unix for many years, it definitely rings true to me a bit. Yihui has been exploring a more minimalist approach to some of his software development. I've seen some of the newer packages or newer utilities he's been spinning up. For instance, one called TinyTeX, a way to get LaTeX installed in a very streamlined way onto your system without going through a full, bloated LaTeX distribution, things like that. And he's also experimenting with other areas in this vein, and it sounds like it's more of a philosophical shift.
He's not saying that either approach is wrong, whether it's a minimalist approach or an approach like Quarto, or even Shiny itself, that tries to encompass a lot of things. I think they each have their value. Right? It just depends on your philosophy and where you want to take your development. Now, really, the part that hits home here is that it does sound like this was a bit of a surprise. And Yihui, for those that don't know, is a father. He's married. He's got his own house. He is definitely supporting his family. So this is now, to be perfectly frank, an uncertain time for him.
So he has put something at the end of the blog post that he hardly ever does; in fact, I don't remember the last time he's mentioned this. He did say that, until he's able to land on his feet with, say, a new role in whatever industry he chooses, he has asked for a bit of help, because of the concern he has with this big change, and he does have a GitHub Sponsors page. Mike and I, full disclosure, are both sponsors of his work, and gladly so, because we value so much of what he's done for us. But I will say, after this post made the rounds, he has received just an earth-shattering amount of acknowledgments in the comments of the post. I think we have over 300 or so, at least over 100; it had gone to triple digits since I commented on it. And this is really telling: if you ever had any doubt about the transformative effect Yihui has had with his efforts in the R community, just read the comments on this. There are so many people that have said we owe Yihui so much because of R Markdown, what it's done for reproducible research, what it's done for their reporting, what it's done for their being able to connect with the community. Like I said, blogdown.
I use blogdown. That's how my podcast site was built, for goodness sakes. Like, there are so many ways that we have been leveraging his utilities. So I will admit I am frankly disappointed at the way this came about. I am, again, giving full credit to Yihui for being so gracious in this post. But, certainly, if you've gotten any value out of Yihui's open source work, well, everybody makes their own decisions, but to me, the effect he's had on my development, my journey, my adventures in the community... He was my first interview on the R Podcast, for goodness sakes. He actually said yes. I put him in a room at this local conference called MBSW with my makeshift mic setup, and he talked with me.
And he was so gracious. I felt like I was, you know, Luke Skywalker meeting Yoda, kind of thing. And I'll never be at Yoda's level, but I'm just saying that's how I felt. And then, in subsequent interviews, he has given me some of the most candid thoughts I have ever heard on that show. So I feel personally... you know, I consider him a good friend. Obviously, we live far apart. We only get to interact briefly, but he has done so much for me personally. So if you've had any benefit from Yihui's work, I would at least consider helping him out at this current time.
But I share what everybody said in the responses to this, both in the post and on X and Mastodon. So many people's lives have changed because of what Yihui has done. So I sincerely hope he lands on his feet soon. We're thinking of Yihui. If there's anything we can do, obviously, we are here to help. But best of luck to you. And again, the post sparked so many thoughts that came to mind after reading it. It's definitely, like I said, not a highlight in the traditional sense, but the impact Yihui has had on the community cannot be overstated. And I think this blog post clearly shows that. So that's a lot for me. Mike, what do you got?
[00:38:40] Mike Thomas:
Yeah. I guess I'll start with maybe a couple of calls to action. And the first call to action, I would say, is if you have benefited at all, especially financially, you know, if your salary, what you do for a living, includes leveraging R Markdown, or ever included leveraging R Markdown, and you essentially were paid to use R Markdown, which is obviously free, I would have a hard time not justifying sponsoring Yihui, you know, in some way. I know a lot of people have, including you and I, Eric, but the value, and I don't think it's unfair to say monetary value, that the community has probably gained, in terms of a lot of employed data scientists getting a ton of value from R Markdown...
I'm not sure that's even quantifiable. So if you can, I don't think there's any better use case for making that charitable contribution to Yihui at this time, in, you know, his journey. So I guess that's my first call to action. And my second call to action would be: if you have an opportunity within your organization, whether it be a contract opportunity or an opportunity where you think you need someone with his skill set as an incredible software engineer, reach out to him. Go to rweekly.org. Check out this blog post; it's called "Bye, RStudio/Posit!"
And I'm sure his contact information is at the bottom of that blog post. I know his GitHub is. He has an about page linked in that blog post with a contact-me link where you can get a hold of him. So if you have opportunities for Yihui, reach out to him and see if he may be interested, you know, because he is not just this mythical Yoda. He is a person, you know, with a family as well. And I know, at this point in time, especially within tech, there are a lot of folks that are probably experiencing similar things, depending on, you know, where you work.
So it can be a tricky time, and, unfortunately, Yihui has been bitten by it as well. So let's try to lift him up, you know, because he has given so much to us, those of us who are data scientists that use the R Markdown ecosystem and pagedown and bookdown and all these different utilities. So I guess that's what I wanted to start out with: a couple calls to action, if you haven't yet really considered sponsoring him or reaching out to him with opportunities if you see them. Second, it's hard, Eric, not to be emotional about this. You know?
And layoffs in business are a reality. You know, one thing that is disappointing to me, and I don't think I'm being too frank here, is that it seemed like it was quite a quick surprise to Yihui, for someone who had worked there for 10 years and really given so much to what RStudio has been able to create. I have to imagine that, you know, while I understand that a lot of folks worked on Quarto, a lot of it stands on the shoulders of what Yihui built within the R Markdown ecosystem for many years. So it's really disappointing that it sort of came as a surprise to Yihui. I'm very glad that they are at least employing him as a contractor to be able to work on some things. You know, I guess the reality of business is that if you don't have enough work for someone, then you probably only want to pay them for the amount of work that you have for them. So I don't know if the transition to Quarto means that there's less work for Yihui to do on some of the software that he was maintaining and working on. That's just a guess, and we don't necessarily have that information at this time. But it's emotional, I think, more so because of the way that Posit is structured as well, as a public benefit corporation and not just a regular corporation.
Right. And that creates a dynamic between Posit and the R community that's unique. So when changes take place at Posit, we feel the effects of that somewhat personally, and that's very unusual. And open source is really rooted in transparency, right? One of the benefits, one of the real cornerstones of open source software development, is that others can see exactly what we're doing and what we're working on, and they can contribute to it and try to help. So I would like some of these changes, for lack of a better word, to be a little more transparent within the community. It feels like maybe there are some walls being built up between the community and Posit that may not have previously existed, with things like this coming out: the loss of a lot of the Posit Academy folks, and now Yihui as well.
I think there's not a lot of acknowledgement of it, except maybe by the folks who are directly affected. It's unusually quiet from their perspective. So I would like a better understanding of the directions things are going, a better sense of transparency, just because of this really unique relationship that the R community has with what used to be RStudio and is now Posit. It's something that is emotional, I think, to a lot of us; I don't think I'm just speaking for myself. So this was a tough blog post to read. I think Yihui may be taking it better than some of us reading it. He takes the high road to every extent in this blog post. He's extremely grateful to the folks he worked with and who employed him for the past 10 years.
He mentions in the comments of the post that in Chinese, the word for crisis consists of the characters for danger and opportunity. He's optimistic about the opportunity part, not very concerned about the danger, and he really believes it's going to be a blessing in disguise. So if nothing else, I really hope that Yihui finds a new role that he's excited about and loves, that he lands on his feet and finds a next great opportunity for himself. I think we have a lot of thoughts on this blog post, and I'm glad we talked about it. It is emotional to me; that's the best word I can use to characterize how I feel about this blog post, what Yihui's work means to us, and what he as a member of the community means to us.
[00:46:05] Eric Nantz:
Yeah. I echo a lot of that, and you are not alone. I've seen posts on Mastodon from others a little concerned about, as you said, the silence on some of these maneuvers. For those who aren't aware, as we discussed last year, Posit did do some layoffs that affected its open source division as well as other divisions. And yes, you might hear from people affected by it directly, but there wasn't a lot said in the public-facing communication about it. And as of now, I'm not seeing any official response to this either.
I do think this may be a wake-up call in a couple of senses. There is an interesting dichotomy where, unlike most of the tech industry, Posit is a PBC. That puts you under a different lens, in my opinion. Whether you consider that right or wrong, they chose it. They chose to be a PBC; that was JJ's vision that we heard in the keynote a couple of years ago at rstudio::conf. With that, and Eric's going on his soapbox here, I think we are owed a little more transparency on this, as people who are not just fans of open source and fans of data science; this tooling is immensely important to the work we do, especially on the open source side of it.
Yes, of course, the commercial products help too. But for those who don't know: when Quarto compiles anything involving R execution chunks, it's using knitr under the hood. Guess who wrote knitr? Yihui. So Quarto is not possible on the R side without what Yihui did. If Posit is now funneling more resources into Quarto development, fair play, it's their company, and obviously interoperability is a big focus for them now. But at the same time, they are indeed standing on the shoulders of the absolutely giant efforts that Yihui has built here. So, again, I do think a little more transparency is warranted.
And it has spurred some concerns about future directions; that could be a whole other podcast in and of itself. I think we'll just leave it at: we're interested to see what 2024 holds. But, again, full marks and full credit to Yihui for taking the high road on everything here. And the response from the community has been eye-opening, to say the least. I misspoke earlier: ever since the blog post, he has been hearing from many, many people. Familiar names, new names, those who are maybe what I call dark matter developers and may not comment very much, even when they see somebody who has that kind of impact on their daily work or their daily data science journey. They're coming out to say their thanks. So I do think, as with open source in general, we need to do a better job of thanking contributors, not just in times like this but regularly throughout the year, because it can pick you up. If you're an open source developer having a rough time trying to maintain something, just having that pick-me-up really helps. We don't have to wait for an event like this. But with that said, full credit to Yihui for being gracious through this, and I would imagine he is going to be hearing from a lot of people about future opportunities.
I hope he takes what's best for him and his family, and I'll certainly be very curious what the future holds for him. Well said, Eric. It's hard to transition from that, but that was a jam-packed summary of our highlight stories. We're going to close out with a couple of additional finds from this issue. You heard me mention in Nicola's segment that she's going to be pursuing Nix as a tie-in with R. Well, our good friend Bruno Rodrigues has been continuing his journey with the Nix package system and R, and he has part eight of his reproducible data science with Nix blog series, which gets more into the weeds of just how open source plays a critical role in Nix itself.
And in fact, every package in Nix is, I'd say simply, well, I shouldn't say simply, a set of scripts that takes the upstream utilities and bundles them up in a way Nix can understand, so we can install them via the Nix package manager right then and there on the spot. And through Bruno's work on the rix package, he is trying to make that easier from within R itself. Acknowledging that many R users are very comfortable with the RStudio IDE, he's been trying to make that part of the rix packaging process too, so you get a version of RStudio inside that reproducible project.
That then ties into the custom R installation and the custom set of packages that the Nix package system is exposing. Well, RStudio is a bit of a finicky piece of software to compile, run, and install manually, especially on that little thing called macOS; it's a bit unwieldy. So in his post, he actually talks about setting up a GoFundMe to help pay for a specialized vendor to assist with that part of the process for macOS users. Unfortunately, it did not hit its funding goal. But to Bruno's credit, he has donated the funds that were contributed back to the R Foundation itself. So fair play to Bruno for at least helping the R project benefit from that request.
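For a flavor of what Bruno's package does, here is a rough sketch of generating a Nix environment definition from R. This is a hypothetical illustration, not taken from his post: the `rix()` argument names are from memory and may differ across releases, so check the rix documentation before relying on it.

```r
# Sketch: ask rix to write a default.nix describing a pinned,
# reproducible R environment. Argument names are assumptions.
library(rix)

rix(
  r_ver        = "4.3.2",                # pin a specific R version
  r_pkgs       = c("dplyr", "ggplot2"),  # CRAN packages to include
  ide          = "rstudio",              # bundle RStudio into the environment
  project_path = ".",                    # write default.nix into this folder
  overwrite    = TRUE
)

# Then, from a shell on a machine with Nix installed:
#   nix-build
#   nix-shell
```

The generated `default.nix` is exactly the kind of "set of scripts" described above: a declarative recipe that the Nix package manager can turn into an installed, isolated environment on the spot.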
But I'm going to conclude this additional find by coming back to Posit for a second. Posit? If you don't know, now you know: Nix is becoming a thing in data science. It's not just Bruno; many others are pursuing this too. It's your IDE, folks. Maybe you could help out a bit on this too. Just saying. I'll leave it at that. Mike, what do you got?
[00:52:31] Mike Thomas:
There's a call to action. So that's a great find. I still have to get my hands dirty with Nix at some point, and I know Bruno has ample resources available to help me do that. One blog post I found that caught a lot of fire this past week, and was really interesting, was from Emily Riederer on Python Rgonomics. It's aimed at R developers needing to switch to Python for a particular project, or just trying to learn a little Python to keep up, and it maps some of the concepts from R into Python to get you up and running. As a quick highlight, here is some of the tooling Emily recommends for getting started with Python.
For installation, she recommends pyenv, which allows you to switch back and forth between different versions of Python really easily. So if on one project you need to use Python 3.9, and on the next project you need Python 3.10, pyenv lets you switch between those two, or multiple versions of Python, really easily from the command line. You can also set one to be your global, default version of Python. For data analysis, we have heard a lot about the pandas package being maybe the equivalent of dplyr, but Emily argues that the Polars package actually has a syntax that is more similar to dplyr, and I agree. It's also more performant.
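For context, that per-project switching works by writing a small `.python-version` config file that pyenv's shims read. Here is a sketch of the usual workflow; the version numbers are arbitrary examples, and the commands assume a standard pyenv installation:

```shell
# Install two interpreter versions side by side (example versions)
pyenv install 3.9.18
pyenv install 3.10.13

# Make 3.10 the default everywhere...
pyenv global 3.10.13

# ...but pin 3.9 inside one project. This writes a .python-version
# file into the directory, which pyenv consults automatically.
cd my-legacy-project
pyenv local 3.9.18
python --version   # resolves to 3.9.18 while inside this folder
```

Because the pin lives in a plain text file checked into the project, collaborators with pyenv installed pick up the same interpreter version automatically.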
Polars is a really cool package, and it's really efficient for working with data, large data in particular, but small datasets as well. I think it works great, and you'll find that the syntax looks very similar to what you'd write with dplyr: verbs such as select, filter, and group by all map between dplyr and Polars, so the code should be pretty familiar to look at. For communication purposes, she references the Great Tables package in Python, which is the port of the gt package, by Rich Iannone at Posit, from R to Python. It's very new, so if you're a Python user looking to author really nicely formatted tables within your reports, I would check that one out. And then for notebooks, obviously, Quarto.
And lastly, in terms of environment management, she references the PDM tool in Python as maybe being the equivalent of the renv package in R, or at least the one she prefers the most. So it's a great list of the different tools Emily uses to get up and running across all of her Python projects, along with an explanation of why she uses them and how they may be similar to, or differ from, your experience using similar tooling in R.
[00:55:34] Eric Nantz:
Yeah, Emily does a fantastic job here. This is something I've struggled with immensely, even just knowing where to start. I have a couple of projects where, because I'm extending a package in Python for some RSS feed stuff, I'm probably going to stay in that ecosystem, along with some of my very in-depth Podcasting 2.0 explorations. So having this familiarity of knowing where to go, to maybe summarize that podcast database effectively in my Quarto notebook with packages that give me that R flavor, is extremely helpful for somebody who's new to Python. This should be your go-to post if you're like me, just dabbling your toes in Python for some interoperability work without it being your full-time focus. Just knowing where to go first is immensely helpful, because there's a wealth of choices out there, and you can go down rabbit holes on environment management and, frankly, bang your head against the wall. I can't tell you how many times I've borked projects at the day job because of a Python virtual environment setup gone horribly wrong. Don't get me started on Anaconda.
There be dragons on my HPC system with that. So, yeah, I'm sure I'm going to be playing with a lot of what Emily is recommending here quite a bit.
[00:56:55] Mike Thomas:
Dev containers.
[00:56:57] Eric Nantz:
Dev containers. Yes. Exactly. I do have hooks into that in mind, if only the rest of my projects could benefit from it. Well, as you can tell, we've had some fun banter here. It's been a few weeks for Mike and me, so we came with our opinions, as they say. But of course, there is much more to R Weekly itself than us just bantering. The full issue has so much more great content: lots of new and updated packages, some that really caught my attention on the visualization side of things especially. So definitely check that out at rweekly.org. Also, we love to hear from all of you in the community.
And going back to what we mentioned with Yihui, I'm going to make this call-out now. I will admit we don't get the whole boost thing very often on this show; I hope that changes from time to time. But for the month of January, if any of you are gracious enough to boost this show with your favorite podcast app, or on the Podcast Index itself, which I have linked in the show notes, I will funnel that directly to Yihui. So if you're interested in supporting Yihui in a different way, a boost would be a way to do that, and I will personally make sure it happens.
So definitely keep that in mind, but also, we just love hearing from you in general. We have a contact page linked in the episode show notes, and we are somewhat active on social media. I am more active on Mastodon, where my handle is @[email protected]; I'm sporadically on the X thing at @theRcast, and I'm also on LinkedIn from time to time. Mike, where can they find you?
[00:58:32] Mike Thomas:
LinkedIn is probably the best place to find me; my tag there is michaeljthomas2. You can also find me on Mastodon at @[email protected].
[00:58:46] Eric Nantz:
Very nice. Very nice. And like I said, it's great to be back in the swing of things with you; it feels more normal again as we kick off the month of January with this supersized episode. Well, that will do it for us. Like I said, we came with our opinions; hopefully, you enjoyed it. We'd love to hear from you, and we will be back with another episode of R Weekly Highlights next week.