dmv.community is one of the many independent Mastodon servers you can use to participate in the fediverse.
A small regional Mastodon instance for those in the DC, Maryland, and Virginia areas. Local news, commentary, and conversation.


#RStats

126 posts · 109 participants · 19 posts today

#rstats Is there an existing tool to automate a reprex → RPubs pipeline? My current manual workflow: make a reprex in an .R script, copy the contents into a .qmd, and use the publish feature in the RStudio IDE.

Sometimes my reprexes get just a tad more complex and require some prose to walk through the steps. In those cases I like publishing them almost like standalone micro blog posts.

Ex: this reprex doc I made to show how to recover ggrepel coordinates rpubs.com/yjunechoe/ggrepel-re

rpubs.com — RPubs: Recover ggrepel drawn positions
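There doesn't seem to be a single built-in function for this, but the two documented entry points — `reprex::reprex()` for rendering and `rsconnect::rpubsUpload()` for publishing — can be chained in a small helper. A sketch, assuming {reprex} writes its HTML output next to the input with the default `_reprex.html` suffix; the helper name and example file are made up:

```r
library(reprex)
library(rsconnect)

publish_reprex <- function(r_script, title) {
  # Render the .R script as an HTML reprex; by default the output
  # lands next to the input as <input>_reprex.html
  reprex(input = r_script, venue = "html", html_preview = FALSE)
  html_file <- sub("\\.R$", "_reprex.html", r_script)

  # rpubsUpload() returns a list with a continueUrl to finish
  # publishing in the browser (RPubs needs a logged-in session)
  result <- rpubsUpload(
    title       = title,
    contentFile = html_file,
    originalDoc = r_script
  )
  browseURL(result$continueUrl)
}

# publish_reprex("ggrepel-recover.R", "Recover ggrepel drawn positions")
```

This skips the .qmd step entirely, since reprex can render straight to HTML — though for the prose-heavy ones, a .qmd in between still makes sense.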

#rstats hivemind: would it be too funky to define a package version major.minor.patch.dev as YYYY.MM.DD.VERSION, i.e. map major to the year, minor to the month, patch to the day, and leave the dev component for the actual version..? I'm thinking of a data package whose upstream data releases are versioned by date... has anyone ever tried such a heretical approach? Would CRAN maintainers be okay with this?! ;)

asking for a friend.
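For what it's worth, base R's version parser happily accepts four-component versions and compares them numerically component by component, so date-mapped versions at least sort correctly — whether CRAN maintainers like the look of them is a separate question. A quick check:

```r
# package_version() parses dotted integer versions of any length
v_old <- package_version("2024.12.31.2")
v_new <- package_version("2025.4.1.0")

v_new > v_old   # TRUE: components compare as integers, not as strings

# Caveat: leading zeros are not preserved, since each component
# is an integer -- "2025.04.01.0" and "2025.4.1.0" are the same version
package_version("2025.04.01.0") == v_new   # TRUE
```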

On #regression models #rstats:

1) the cleaning method applied to test data has more impact than the one applied to training data

2) best performance does not require the same cleaning process for both training and test data

3) regression modelers should evaluate several test-data cleaning pipelines.

peerj.com/articles/cs-2793/

PeerJ Computer Science — The effects of mismatched train and test data cleaning pipelines on regression models: lessons for practice

Data quality problems are present in all real-world, large-scale datasets. Each of these potential problems can be addressed in multiple ways through data cleaning. However, there is no single best data cleaning approach that always produces a perfect result, meaning that a choice needs to be made about which approach to use. At the same time, machine learning (ML) models are being trained and tested on these cleaned datasets, usually with one single data cleaning pipeline applied. In practice, however, data cleaning pipelines are updated regularly, often without retraining of production models. It is therefore common to apply different test (or production) data than the data on which the models were originally trained. The changes in these new test data and the data cleaning process applied can have potential ramifications for model performance. In this article, we show the impact that altering a data cleaning pipeline between the training and testing steps of an ML workflow can have. Through the fitting and evaluation of over 6,000 models, we find that mismatches between cleaning pipelines on training and test data can have a meaningful impact on regression model performance. Counter-intuitively, such mismatches can improve test set performance and potentially alter model selection choices.
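Point 3 is easy to prototype: fit one model with a fixed training-data cleaning, then score the same test set under several cleaning variants. A hypothetical base-R sketch — the toy data, cleaning functions, and RMSE comparison are illustrative, not the paper's actual pipelines:

```r
# Toy data with injected missingness in the predictor
set.seed(1)
n <- 200
df <- data.frame(x = rnorm(n))
df$y <- 2 * df$x + rnorm(n)
df$x[sample(n, 20)] <- NA

train <- df[1:150, ]
test  <- df[151:200, ]

# Candidate cleaning pipelines (illustrative)
pipelines <- list(
  drop_na     = function(d) d[!is.na(d$x), ],
  mean_impute = function(d) { d$x[is.na(d$x)] <- mean(d$x, na.rm = TRUE); d },
  zero_impute = function(d) { d$x[is.na(d$x)] <- 0; d }
)

# One fixed cleaning for training...
fit <- lm(y ~ x, data = pipelines$mean_impute(train))

# ...then evaluate every cleaning variant on the test data
rmse <- sapply(pipelines, function(clean) {
  te <- clean(test)
  sqrt(mean((te$y - predict(fit, te))^2))
})
round(rmse, 3)
```

Per the paper's counter-intuitive finding, the best-scoring test cleaning need not match the training one.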

Looking forward to a (virtual) homecoming next week! Guest lecturing at my alma mater for STAT 447 @ UIUC on Wednesday, April 9, 6pm Central.

Shiny Without Boundaries: One App, Multiple Destinations

Deploy your #RStats #rshiny apps anywhere: cloud, desktop, browser & beyond.