diff --git a/UKB_notebooks/ukb-rap-intro-functions-r-basic.ipynb b/UKB_notebooks/ukb-rap-intro-functions-r-basic.ipynb
new file mode 100644
index 0000000..d6b0e2b
--- /dev/null
+++ b/UKB_notebooks/ukb-rap-intro-functions-r-basic.ipynb
@@ -0,0 +1,405 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "jupyter": {
+ "outputs_hidden": true
+ }
+ },
+ "source": [
+ "\n",
+ "# Using bash and loading R packages in R notebooks\n",
+ "***\n",
+ "\n",
+ "This notebook is delivered \"As-Is\". Notwithstanding anything to the contrary, DNAnexus will have no warranty, support, liability or other obligations with respect to Materials provided hereunder.\n",
+ "\n",
+ "[MIT License](https://github.com/dnanexus/OpenBio/blob/master/LICENSE.md) applies to this notebook.\n",
+ "\n",
+ "\n",
+ "***"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Introduction \n",
+ "This R notebook highlights tips and tricks for using bash from the R kernel and for loading R packages."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## JupyterLab app details (launch configuration) \n",
+ "\n",
+ "### Recommended configuration\n",
+ "- runtime: < 20 min\n",
+ "- cluster configuration: `single node`\n",
+ "- recommended instance: `mem1_ssd1_v2_x4`\n",
+ "- cost: < £0.05\n",
+ "\n",
+ "\n",
+ "### Performance comparison\n",
+ "- **mem1_ssd1_v2_x4, single node**: \n",
+ " - runtime: < 20 min\n",
+ " - cost: < £0.05\n",
+ "- mem1_ssd1_v2_x16, single node:\n",
+ " - runtime: < 20 min\n",
+ " - cost: < £0.15\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Installing `R` packages "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Examples of R packages commonly used for statistical genetics analysis: \n",
+ "[devtools](https://cran.r-project.org/web/packages/devtools/index.html), \n",
+ "[ggplot2](https://cran.r-project.org/web/packages/ggplot2/index.html), \n",
+ "[tidyverse](https://cran.r-project.org/web/packages/tidyverse/index.html), \n",
+ "[mclust](https://cran.r-project.org/web/packages/mclust/index.html), \n",
+ "[RNOmni](https://cran.r-project.org/web/packages/RNOmni/index.html), \n",
+ "[ISLR](https://cran.r-project.org/web/packages/ISLR/index.html), \n",
+ "[xgboost](https://cran.r-project.org/web/packages/xgboost/index.html), \n",
+ "[pacman](https://cran.r-project.org/web/packages/pacman/index.html), \n",
+ "[ivpack](https://cran.r-project.org/web/packages/ivpack/index.html), \n",
+ "[meta](https://cran.r-project.org/web/packages/meta/index.html), \n",
+ "[MendelianRandomization](https://cran.r-project.org/web/packages/MendelianRandomization/index.html), \n",
+ "[TwoSampleMR](https://github.com/MRCIEU/TwoSampleMR), \n",
+ "[randomForest](https://cran.r-project.org/web/packages/randomForest/randomForest.pdf), \n",
+ "[ggrepel](https://cran.r-project.org/web/packages/ggrepel/index.html), \n",
+ "[reshape2](https://cran.r-project.org/web/packages/reshape2/index.html)\n",
+ "\n",
+ "Many packages are preinstalled in the JupyterLab base image. You can list them with `installed.packages()`. \n",
+ "\n",
+ "\n",
+ "### List already installed R packages on UKB RAP"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "installed.packages()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Check if a package is already installed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pkg <- c(\n",
+ " \"remotes\",\n",
+ " \"tidyverse\",\n",
+ " \"mclust\",\n",
+ " \"RNOmni\",\n",
+ " \"ISLR\",\n",
+ " \"xgboost\",\n",
+ " \"pacman\",\n",
+ " \"ivpack\",\n",
+ " \"meta\",\n",
+ " \"MendelianRandomization\",\n",
+ " \"randomForest\",\n",
+ " \"ggrepel\",\n",
+ " \"reshape2\"\n",
+ ")\n",
+ "\n",
+ "# List the packages from `pkg` that are not yet installed\n",
+ "pkg[!(pkg %in% installed.packages()[, \"Package\"])]"
+ ]
+ },
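+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A minimal sketch of a lighter-weight check, offered as an alternative rather than as part of the original workflow. The R documentation for `installed.packages()` notes that it can be slow when many packages are installed and suggests `system.file()` or `find.package()` for testing whether a named package is present. The helper name `is_installed` and the vector `candidates` are hypothetical.\n",
+ "\n",
+ "```r\n",
+ "# Hypothetical helper: TRUE if the package can be found in the library paths\n",
+ "is_installed <- function(p) nzchar(system.file(package = p))\n",
+ "\n",
+ "# Illustrative subset of the packages checked above\n",
+ "candidates <- c(\"remotes\", \"tidyverse\", \"mclust\")\n",
+ "\n",
+ "# Named logical vector, one entry per package\n",
+ "found <- vapply(candidates, is_installed, logical(1))\n",
+ "\n",
+ "# Packages that still need to be installed\n",
+ "names(found)[!found]\n",
+ "```"
+ ]
+ },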
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Install additional packages\n",
+ "\n",
+ "Uncomment the install commands if you are comfortable with the library license and want to install the library and run the parts of the notebook that depend on it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#install.packages(c(\"tidyverse\"), repos = \"https://cloud.r-project.org\")"
+ ]
+ },
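+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To avoid reinstalling packages that are already present in the base image, the install step can be restricted to the missing ones. This is a sketch under the assumption that all requested packages are available from CRAN; the vector name `wanted` is hypothetical, and the install call is left commented out for the same license reasons as above.\n",
+ "\n",
+ "```r\n",
+ "# Hypothetical list of packages needed for an analysis\n",
+ "wanted <- c(\"tidyverse\", \"mclust\")\n",
+ "\n",
+ "# Row names of installed.packages() are the installed package names\n",
+ "to_install <- setdiff(wanted, rownames(installed.packages()))\n",
+ "\n",
+ "# Uncomment to install only what is missing\n",
+ "# if (length(to_install) > 0) {\n",
+ "#   install.packages(to_install, repos = \"https://cloud.r-project.org\")\n",
+ "# }\n",
+ "\n",
+ "to_install\n",
+ "```"
+ ]
+ },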
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Load libraries (installed packages)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "library(tidyverse, quietly = TRUE)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Install packages from Github repositories\n",
+ "\n",
+ "Uncomment the install commands if you are comfortable with the library license and want to install the library and run the parts of the notebook that depend on it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#remotes::install_github(\"rstudio/shiny\")\n",
+ "library(shiny, quietly = TRUE)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Make a SNAPSHOT of your installed R packages \n",
+ "\n",
+ "Once you have installed the packages you want to reuse in your next session, you can create a snapshot. A snapshot can be loaded when JupyterLab starts and carries over any additional packages installed on this worker. See the [documentation](https://documentation.dnanexus.com/user/jupyter-notebooks#environment-snapshots) for more details."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Using `bash` from the R kernel \n",
+ "\n",
+ "`system()` lets you execute bash commands from the R kernel."
+ ]
+ },
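+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The cells in this section pass `intern = TRUE` to some `system()` calls and not to others. The sketch below illustrates the difference using only base R and standard shell commands: with `intern = TRUE` the command's standard output is returned as a character vector, whereas with the default `intern = FALSE` the return value is the command's exit status and the output goes to the process's standard output, which a notebook may not display.\n",
+ "\n",
+ "```r\n",
+ "# intern = TRUE: capture standard output as a character vector (one element per line)\n",
+ "out <- system(\"echo hello\", intern = TRUE)\n",
+ "out\n",
+ "\n",
+ "# intern = FALSE (default): the exit status of the command is returned\n",
+ "status <- system(\"exit 3\")\n",
+ "\n",
+ "# A non-zero status signals that the command failed\n",
+ "status != 0\n",
+ "```"
+ ]
+ },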
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# View current directory in the UKB RAP project\n",
+ "system(\"dx pwd\", intern = TRUE)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create a test file and read it, as if from a bash terminal"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "system(\"echo 'This is a test' > test.txt\", intern = TRUE)\n",
+ "system(\"head test.txt\", intern = TRUE)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Upload file to UKB RAP\n",
+ "system(\"dx upload test.txt\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# List all files and folders in the current directory in the UKB RAP project\n",
+ "system(\"dx ls\", intern = TRUE)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Remove test.txt file from UKB RAP\n",
+ "system(\"dx rm test.txt\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Uploading data to the UKB RAP \n",
+ "\n",
+ "This example uses public data bundled with the `MendelianRandomization` R package."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "library(MendelianRandomization)\n",
+ "\n",
+ "betas <- cbind(ldlc, hdlc, trig) %>% as.data.frame()\n",
+ "betases <- cbind(ldlcse, hdlcse, trigse) %>% as.data.frame()\n",
+ "\n",
+ "snp_df <- cbind(betas, betases)\n",
+ "snp_df$id <- paste0(\"snp_\", seq_len(nrow(snp_df)))\n",
+ "\n",
+ "head(snp_df)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "snp_df %>% write_csv(\"snp_df.csv\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Remove any previous version of snp_df from UKB RAP if it exists\n",
+ "system(\"dx rm snp_df.csv\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "system(\"dx upload snp_df.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Read data into R "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Option 1: Download a file from RAP project storage to the JupyterLab environment and load it into the session"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "system(\"dx download '/Showcase metadata/field.tsv'\", intern = TRUE)"
+ ]
+ },
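+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The project path above contains a space, which is why it is wrapped in single quotes inside the command string. When the path is held in a variable, base R's `shQuote()` can add the quoting. This is a sketch that only builds the command; the variable names `remote` and `cmd` are hypothetical.\n",
+ "\n",
+ "```r\n",
+ "# Project path taken from the cell above; note the space in the folder name\n",
+ "remote <- \"/Showcase metadata/field.tsv\"\n",
+ "\n",
+ "# shQuote() wraps the path in single quotes so the shell treats it as one argument\n",
+ "cmd <- paste(\"dx download\", shQuote(remote))\n",
+ "cmd\n",
+ "\n",
+ "# Uncomment to run the download\n",
+ "# system(cmd, intern = TRUE)\n",
+ "```"
+ ]
+ },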
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "field_info <- read.table(\"field.tsv\", sep = \"\\t\", header = TRUE, fill = TRUE)\n",
+ "head(field_info)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Option 2: Stream a file directly from the project into an R data frame with `dxfuse`\n",
+ "\n",
+ "[`dxfuse`](https://github.com/dnanexus/dxfuse) is a filesystem that gives users access to the DNAnexus storage system.\n",
+ "\n",
+ "Use this option when there is no need to download files to the local environment of this worker. It is recommended for larger files.\n",
+ "\n",
+ "*Notes*:\n",
+ "- `dxfuse` is read-only. \n",
+ "- After mounting, the file system structure remains fixed. Any changes made externally in the project (e.g. a new file uploaded to the project) are not reflected on the local worker."
+ ]
+ },
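+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Because the mounted structure is fixed after mounting, it can help to confirm that a file is visible under `/mnt/project` before reading it. A minimal sketch using base R only; the path is the one used in this notebook, and the variable name `path` is hypothetical.\n",
+ "\n",
+ "```r\n",
+ "# Location of the file on the mounted project\n",
+ "path <- \"/mnt/project/Showcase metadata/field.tsv\"\n",
+ "\n",
+ "if (file.exists(path)) {\n",
+ "  # file.size() reports the size in bytes without reading the contents\n",
+ "  message(\"Found \", path, \" (\", file.size(path), \" bytes)\")\n",
+ "} else {\n",
+ "  message(\"Not found: the file may have been added after the project was mounted\")\n",
+ "}\n",
+ "```"
+ ]
+ },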
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "field_info <- read.table(\"/mnt/project/Showcase metadata/field.tsv\", sep = \"\\t\", header = TRUE, fill = TRUE)\n",
+ "head(field_info)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "R",
+ "language": "R",
+ "name": "ir"
+ },
+ "language_info": {
+ "codemirror_mode": "r",
+ "file_extension": ".r",
+ "mimetype": "text/x-r-source",
+ "name": "R",
+ "pygments_lexer": "r"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": true,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}